How to Secure Your Source Code
Industry Report
Intro: Breaching to the Choir
When a sensitive data breach at an important company like Equifax makes headline news, millions of consumers become immediately aware that they’re now victims. The story is always about stolen data and the drama around the company’s attempt to cover up the breach. But compromised data is a consequence of a security failure. What are the actual causes of such a severe security breach?
Oftentimes, sensitive data is compromised through insecure source code. That’s the story within the story, the compelling story rarely told. Such failures occur frequently, even within the biggest and most familiar companies like Uber. Security failures even occur inside companies specializing in security, like OneLogin, but we rarely hear why these failures happened.
This is true in part because source code vulnerability is highly technical. As we will see, this is one issue we cannot afford to avoid because it is “too technical.” But there is a more nefarious reason we don’t hear about the cause of security failures: if companies revealed that their software lacked important security features, then they would quickly lose customer confidence. For this reason, Yahoo and other companies have intentionally concealed security breaches for years.
We will find that the source of these security breaches is the source code itself. In this article, we will explore the security vulnerabilities of source code repositories such as Git and Apache Subversion. We will also discuss a variety of solutions including source code scanning. Let’s start with a look at some recent security failures and uncover the common denominator.
Hacks in the News
In late 2016, a total of 57 million Uber users and drivers fell victim to one of the largest data breaches in recent history. Two hackers made off with personal information, including phone numbers, names, and email addresses. What’s worse, however, is that the company covered up the breach for more than a year. Settling claims associated with the cover-up will cost Uber $148 million.
How do attackers pull off hacks on this scale? The absurd but true answer is that bad actors often don’t hack anything. The most damaging “hacks” involve little more ingenuity than the accidental discovery of passwords in an unencrypted file, hosted in an indexable web directory or an Amazon S3 bucket! This is what happened at Uber, and we’ll be taking a more in-depth look at how attackers get access to login credentials through vulnerabilities in versioning platforms and code repositories like Git and Subversion.
Take for example another high-profile breach that impacted more than 148 million consumers: the breach of Equifax, a consumer credit-reporting agency. The massive attack on Equifax exploited another popular developer tool: Apache Struts, an open source framework for building Java web applications. Apache is well known for producing high-quality and widely popular open source developer tools; however, Apache Struts has a long list of known security issues. One developer forum currently maintains a post of 72 known security issues! Far from moderate in scope, these security vulnerabilities span denial of service and remote code execution attacks, which can cripple an enterprise ecommerce platform.
This prompts the most natural question: if Apache Struts has known security issues, why was Equifax using it? It’s widely stated that Equifax knew about those security issues long before they were exploited, but Equifax did not take any deliberate action to resolve the issues. Why? To understand the problem with handling known bugs and attack vectors in such commonly used software, we must explore the technical side of the issue.
Even companies whose sole focus is the security of data are not immune. For example, OneLogin, a company whose whole raison d’être is web security, was attacked. And this particular story is most germane to our intended purposes here: theft of API secret keys and OAuth tokens. Although OneLogin did not reveal details of the attack, their recommendations to customers on protecting against further breaches indirectly revealed the potential points of ingress. The login credentials that developers use for automated logins (API keys and OAuth tokens) are often stored unencrypted in shell scripts, or may even show up in log files.
But companies face an extraordinary challenge in solving this particular problem: the continuous deployment pipeline that devops and agile teams crave to streamline is much trickier to automate when authentication credentials must be fully secured. It can absolutely be done, but it will require complete attention and continuous review. Let’s start with how and why these security flaws exist.
It Starts with the Code
Ordinary security issues, such as SQL injection or cross-site scripting (XSS), are two-dimensional and recognizable by nontechnical staff through straightforward testing methods or even the most basic vulnerability scanning tools. Testers can run a regression test suite scripted on Cucumber and then report issues without knowing anything about the contents of the automated build scripts that trigger the test from a repo, some of which actually contain secret access keys and tokens. The ways such credentials are used in scripted CI pipelines are limited only by the creativity of the developer. Usually continuous integration and deployment (CI/CD) involves simplifying and automating as much as possible, but scanning the code, the integration, and the deployment process itself is no straightforward task for mere mortals.
Scripted builds must instead be scanned for vulnerabilities, not by humans, but by machines. And they must be sufficiently intelligent to start before the first CI trigger. Jenkins is a popular developer tool that detects when a coder submits (“commits”) a change to an app. Jenkins then triggers a test suite that, if successful, will proceed with an automatic deployment of the successful version of the application. These are some of the steps in CI/CD, which is now a wildly popular way of distributing new software to users. Why are CI and CD the great abyss of security failures?
To achieve a truly automated development pipeline, a developer must script a sequence of events from code change to app deployment. This means that any change made to the web app triggers a new set of test suites to verify that the code change did not break another part of the app. As you can imagine, to automate a web app test, a virtual user must come into being, log in to an account, and do something a normal user would do, perhaps purchasing a TV using a new account and address. How can a bot log in to an account when we have all the security features previously described, such as dynamic PIN authentication? It’s often as simple as pasting cleartext credentials into a file, and that might terrify even the most seasoned systems administrators and devops engineers.
When a developer, oftentimes a QA engineer, scripts an automated CI pipeline, they quite literally enter the login credentials for the virtual users into scripts. These scripts are not compiled or encrypted; they are plaintext scripts executed on both a browser and a server. Automation scripts are usually saved to repositories like Apache Subversion and Git, or in the case of the flippantly naughty, to indexable repositories on wide-open web servers or on public GitHub itself. They are triggered during rollback and release through deployment pipelines. These security vulnerabilities must be detected by a source code scanner before a commit triggers a new build. Only this will prevent hackers from laying eyes on the contents of these scripts. Lax processes on frequently accessed protocols, components, and libraries result in source code vulnerabilities.
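As a rough illustration of what a source code scanner looks for in such scripts, the sketch below checks each line of a script against a few regular expressions. The rule names and the `find_secrets` helper are hypothetical and chosen only for this example; the three patterns are a tiny, illustrative subset, and a production scanner would carry many more rules.

```python
import re

# Illustrative rules only (assumed for this sketch, not an exhaustive set).
RULES = {
    # AWS access key IDs commonly start with "AKIA" followed by 16 characters.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # PEM header of a private key pasted into a script.
    "private_key_block": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    # A credential-like name assigned a quoted literal.
    "hardcoded_password": re.compile(
        r"\b(?:password|passwd|secret|api_key|token)\b\s*[:=]\s*['\"][^'\"]{4,}['\"]",
        re.IGNORECASE,
    ),
}


def find_secrets(text):
    """Return (line_number, rule_name) for every rule that matches a line."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in RULES.items():
            if pattern.search(line):
                findings.append((number, name))
    return findings
```

Run against the text of every script in a repository, a non-empty result would be the signal to stop the build before the commit goes any further.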
Avoid lax processes and ensure that your source code is scanned for security vulnerabilities by a source code scanner. Remember our Uber example? They, like other companies, “frequently accidentally keep credentials in source code that is uploaded to GitHub.”
But this is not accidental—sometimes it’s routine, in the name of convenience or speed of development. And the only way to get in front of a bad habit or reckless automation is to automate scanning it up front.
As mentioned, repos and versioning platforms are strictly the domain of coders. Likewise, testers and QA engineers don’t usually think of build scripts as a part of the web app under development. This mind-set must change, but there must also be an improved platform that enforces security.
Irrefutably, the substance of a company is its source code. New automated delivery pipelines now expose a new form of security vulnerability. With the advent of any new technology, a new set of threats arises. Enterprises whose core product is delivered by a web app must recognize source code security as the utmost priority. Static analysis of source code can identify vulnerabilities in both proprietary and open source code before an application is deployed to production or before a build is packaged and delivered. A security-based versioning and repository platform that detects vulnerabilities can then interrupt the automatic deployment of a new version until the issue is corrected or an authoritative team leader authenticates and approves the release. The prevention of security failures like those that led to the catastrophic Equifax and Uber breaches is well worth a momentary pause in delivery to production. Security-based developer tools must necessarily figure in the next generation of coding standards.
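The gating behavior described above can be reduced to a small policy check. The following is a minimal sketch under stated assumptions: the `Finding` record, the `may_deploy` function, and the idea of passing in a set of approver names are all hypothetical, invented here to show the shape of the decision rather than any particular platform’s API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One scanner result (hypothetical shape for this sketch)."""
    path: str
    rule: str


def may_deploy(findings, approved_by=None, approvers=frozenset()):
    """Allow a release when the scan is clean or a listed approver signed off."""
    if not findings:
        return True
    # Findings exist: deployment stays blocked unless an authorized
    # team leader has explicitly approved this release.
    return approved_by is not None and approved_by in approvers
```

A pipeline step would call `may_deploy` after the scan and stop the deployment whenever it returns `False`.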
Indeed, a critical security feature also needed in Git repos today is the capability to intelligently scan all source code and identify access keys and passwords added by developers. An estimated 75% of security breaches are enabled when developers code secret access keys and passwords into source code. What’s needed is an automatic system that detects key and password entry in scripts and alerts team leaders for review.
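Keyword matching alone misses access keys that carry no telltale name, so scanners often add a randomness check. The sketch below uses Shannon entropy over a token’s characters; the `looks_like_secret` helper and its length and threshold defaults are assumptions picked for illustration and would need tuning against real code, since long ordinary strings can also score high.

```python
import math
from collections import Counter


def shannon_entropy(value):
    """Bits of entropy per character, computed from character frequencies."""
    if not value:
        return 0.0
    counts = Counter(value)
    total = len(value)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_secret(token, min_length=20, threshold=3.5):
    """Flag long tokens whose characters look close to random.

    The defaults are illustrative assumptions, not recommended values.
    """
    return len(token) >= min_length and shannon_entropy(token) >= threshold
```

Tokens flagged this way are candidates for review by a team leader rather than proof of a leaked key.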
One bolted-on, afterthought security method that developers currently use is to separate login credentials into an “Include” file in order to remove passwords, OAuth tokens, and other secret keys from login scripts. But this method depends on the habits of individual developers and defeats the concept of collaborative accountability. We have already shown that devops practices that encourage speed also inspire coders to shortcut security.
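For readers unfamiliar with the practice, the separation of credentials into an include file might look like the sketch below. The `KEY=VALUE` file format, the `TEST_USER` and `TEST_PASSWORD` names, and both helper functions are assumptions made for this example. Note that nothing in the code stops a developer from committing the include file itself, which is exactly the weakness described above.

```python
def parse_include(text):
    """Parse KEY=VALUE lines, ignoring blank lines and '#' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"malformed line: {raw!r}")
        values[key.strip()] = value.strip()
    return values


def login_payload(credentials):
    """Build the test bot's login form from externally supplied credentials."""
    missing = [k for k in ("TEST_USER", "TEST_PASSWORD") if k not in credentials]
    if missing:
        raise KeyError(f"missing credentials: {missing}")
    return {
        "username": credentials["TEST_USER"],
        "password": credentials["TEST_PASSWORD"],
    }
```

The login script then contains only the variable names, while the values live in a file that is meant to stay outside the repository.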
A platform that implements automated vulnerability scanning and analysis is the next generation of standard developer security tools to be included in improved repository and version control platforms. Automated vulnerability scanning democratizes code security across an entire devops or agile team.
A security-based distributed app development platform must implement policy enforcement automatically. In other words, in the same way that Jenkins detects a code change, a source code security system should detect the entry of sensitive credentials or private data. Such a system will critically control and audit who is authorized to access and update source code. The system will provide protected branches and user reports to team leaders for urgent response. The system should have keyword traps and work like an antivirus program to catch entries of credentials before submission and prior to any commit that triggers a build.
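A keyword trap that fires before a commit can be sketched as a check over the lines a change adds. The example below parses a unified diff and reports added lines that assign a value to a credential-like name; the `TRAPS` pattern and both function names are hypothetical, and the keyword list is deliberately short.

```python
import re

# Credential-like names followed by an assignment (illustrative list only).
TRAPS = re.compile(
    r"(password|passwd|secret|api[_-]?key|token)\s*[:=]", re.IGNORECASE
)


def added_lines(diff_text):
    """Yield lines a unified diff adds, skipping the '+++ b/file' headers."""
    for line in diff_text.splitlines():
        if line.startswith("+") and not line.startswith("+++"):
            yield line[1:]


def check_diff(diff_text):
    """Return the added lines that trip a keyword trap."""
    return [line for line in added_lines(diff_text) if TRAPS.search(line)]
```

In a Git pre-commit hook, one plausible wiring is to feed the output of `git diff --cached` to `check_diff` and exit with a non-zero status when the result is non-empty, so the commit never reaches the CI trigger.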
Parallel with safe developer practices is the adoption of enterprise-level standards. The certification of standards compliance is essential to the success of web app development today. A cloud-based distributed software development platform that provides repository and versioning should critically comply with a Service Organization Control (SOC) 2 audit. Such compliance amounts to additional assurance of robust source code security. Ideally, the security protocols of a security-based developer platform with version control should meet comprehensive audit and certification standards to verify compliance as an important layer of quality assurance and customer privacy protection.
Corequisite with SOC 2 certification are several additional compliance standards to guarantee privacy, and all should be considered essential to verification of source code security and cloud data privacy overall. PCI Level 3 and Privacy Shield for privacy practices are most essential in cloud-based e-commerce payment and data security networks.
The Cloud Security Alliance STAR self-assessment is likewise an assurance to all affiliates and customers that a web-based enterprise maintains the highest standards of privacy and source code security.
Data localization is also important in many jurisdictions in that it keeps data nearest to the organization for performance and control. Due to the rise of compliance regulations, specifically GDPR, the EU cloud will be especially beneficial to international customers concerned with data privacy, data protection, and the rise of preferences for data localization.
Packages are collections of code and scripts that programmatically include each other in builds for specific apps. Ultimately this is how a modern web app is constructed; it is an assembly of packages. Packages are often downloaded by customers and integrated into new software products. Such code packages often contain security issues. An enterprise cannot manually scan all the code in every package its developers use, so a significant benefit must ensue from an automatic vulnerability scanning and auditing system. Such a scanner would stand as a sentinel to protect the enterprise from a particular variety of creative but dangerous innovation: the scripting of authentication credentials.
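One simple form of automated package auditing compares the versions a build uses against a list of versions known to fix a vulnerability. The sketch below assumes plain dotted numeric versions and an advisory table mapping each package to its first fixed version; the function names and this data shape are made up for illustration, and real advisory feeds are considerably richer.

```python
def parse_version(text):
    """Turn '2.10.0' into (2, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def vulnerable_packages(installed, advisories):
    """Return (name, installed_version, fixed_version) for outdated packages.

    installed:  {package_name: version_in_use}
    advisories: {package_name: first_fixed_version}  (hypothetical shape)
    """
    flagged = []
    for name, version in sorted(installed.items()):
        fixed = advisories.get(name)
        if fixed is not None and parse_version(version) < parse_version(fixed):
            flagged.append((name, version, fixed))
    return flagged
```

A build step could fail whenever the returned list is non-empty, which puts known-vulnerable packages under the same gate as leaked credentials.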
We have just learned that an unsecured Docker image registry exposed the entire source code of Aeroflot’s core web application. As we have seen, such exclusively developer-inhabited realms as containerization now vehemently demand scrutiny to avoid catastrophic data loss.
Developers share posts to get an idea of how to do something in an unfamiliar case, very often looking for shortcuts. One post illustrates an easy way to script a login for a test case with Selenium. Another post admonishes developers for doing just that! It’s a free-for-all zone, one riddled with risk. It’s a frontier nearly without security.
The wild new frontier of automated continuous integration and delivery pipelines evolves rapidly, with emphasis on speed and innovation. In such a creative atmosphere, even more emphasis is needed on security in version management and repositories. Security-focused developer platforms now exist that implement all the versioning and repo tools but with this crucially needed security. Features such as automated source code vulnerability scanning must now become standard fare in enterprise app development.