AI Software Engineering is here. This isn’t code completion. This isn’t a code generation framework. This is code generation. With innovative tools like ChatGPT and GitHub Copilot, AI-generated code transforms how we create software, enabling faster development for professionals and hobbyists alike. However, as we increasingly rely on these cutting-edge technologies to produce accurate and efficient code, it's crucial to remain vigilant about the potential risks.
In many ways, AI-generated code is no different from code pulled from other sources. The difference is the availability and ease with which it can be consumed. Some of the security implications include:
a. Inaccurate and insecure algorithms can result from using poor-quality training data. If AI models are trained on data containing security flaws or biases, the generated code may inherit these vulnerabilities, leading to potentially insecure implementations.
b. AI-generated code may not undergo sufficient security scanning. Automated code generation could invite less scrutiny, increasing the likelihood that security flaws remain undetected in the final product. By contrast, much of today's open-source and sample code is at least scanned by security tools.
c. Outdated and insecure third-party libraries can be leveraged or recommended by AI-generated code. These dependencies may introduce vulnerabilities if they are not maintained, patched, or properly configured, putting the security of the entire application at risk.
d. Malicious code can result from model attacks on the AI system. Threat actors may manipulate the input data or the model itself to generate code with intentionally introduced vulnerabilities, posing a significant security risk.
e. Inexperienced developers and smaller teams may not be adequately prepared to address potential security concerns. Without proper knowledge and expertise, they might overlook crucial security aspects, leading to vulnerabilities in the AI-generated code.
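To make item (a) concrete, here is a minimal Python sketch (my own illustration, not output from any specific AI tool) of the kind of vulnerability that pervades tutorial code, and therefore training data: SQL built by string formatting, next to the parameterized form a reviewer should expect.

```python
import sqlite3

def find_user_insecure(conn, username):
    # Pattern common in tutorial code (and thus in training data):
    # SQL built by string interpolation, vulnerable to injection.
    query = f"SELECT id FROM users WHERE name = '{username}'"
    return conn.execute(query).fetchall()

def find_user_secure(conn, username):
    # Parameterized query: the driver treats the input as data, not SQL.
    return conn.execute(
        "SELECT id FROM users WHERE name = ?", (username,)
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice'), (2, 'bob')")

payload = "' OR '1'='1"
# The classic injection payload leaks every row from the insecure version...
print(len(find_user_insecure(conn, payload)))  # 2 -- all users leak
# ...but matches nothing when parameterized.
print(len(find_user_secure(conn, payload)))    # 0
```

A model trained on enough of the first pattern will happily reproduce it; the second pattern is what a security-aware review (human or automated) should insist on.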
It is important to restate that some of these security concerns already exist. What changes is that someone who has never coded before will be able to generate, compile, and deploy code in a few days, making these risks both more accessible and more likely.
Current Best Practices
A complete software security assurance program includes:
Threat Modeling: Review the architecture to identify where things can go wrong and offer remediation or mitigation guidance. It is an ongoing process that doesn’t stop once coding starts.
SAST/IAST/DAST: These tools are used to identify security vulnerabilities and weaknesses. Static Application Security Testing inspects the source code. Interactive Application Security Testing monitors the running application. Dynamic Application Security Testing executes tests against the running application.
SBOM/SCA: The Software Bill of Materials provides a list of all third-party libraries, their license information, and known vulnerabilities. Software Composition Analysis focuses on identifying known vulnerabilities in those dependencies.
Vulnerability Management: Continuously identifying, evaluating, and remediating software vulnerabilities.
Developer Training: Guiding developers via point-in-time training, hackathons, or third-party secure-code courses that cover the specific security concerns of different frameworks and architecture designs.
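As a toy illustration of what the static (SAST) half of this tooling does, the sketch below (a deliberately simplified example, not a real scanner) walks a Python syntax tree and flags calls to a tiny denylist of dangerous builtins:

```python
import ast

# A tiny, illustrative denylist; real scanners ship thousands of rules.
DANGEROUS_BUILTINS = {"eval", "exec"}

def static_scan(source: str) -> list[int]:
    """Report line numbers of calls to dangerous builtins in `source`."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in DANGEROUS_BUILTINS):
            findings.append(node.lineno)
    return findings

sample = "user_input = input()\nresult = eval(user_input)\n"
print(static_scan(sample))  # [2]
```

Note that this only works because the tool understands the language and its grammar, which is exactly the assumption that AI-generated code may break.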
Best Practices Today, Not For Tomorrow
AI-generated code will change how software security is managed - from vulnerability management to threat modeling to software bill of materials - impacting everything.
Threat Modeling: Without insight into the third-party AI tools or their training data, the common questions will be “how good is the code we generate?” and “how trustworthy is it?”
As AI code generation moves from “generate this function” to “generate this microservice” to “generate this product”, the threat model becomes harder to reason about: a black box now sits at its center. Does a threat model need to account for the training data? Like any other threat model, it must be updated and revised to keep up with the changing landscape.
SAST/IAST/DAST: Depending on how code is generated, existing tools may not be capable of even executing against the code. Each tool is driven by a knowledge of the architecture, languages, and frameworks. Without that knowledge, the tools become ineffective. It is not hard to imagine an AI generating code in three or four different languages (or generating its own framework or grammar) to increase scalability. And even if the languages are in use, it’s quite possible that the language used is not supported by the scanning technology an organization currently has in place.
SBOM/SCA: What qualifies as an SBOM in the AI world? Does the training data need to be factored into the SBOM? What about the AI configuration? There is also the broader licensing question raised by AI in general. Some of these questions aren’t much different from today’s, but the speed at which they surface can be significant. Maintaining a current SBOM could be quite challenging.
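To ground the question, today's SBOM can be sketched as a simple component inventory. The structure below is loosely modeled on the CycloneDX JSON shape, and the "ai_provenance" entry is purely hypothetical, my guess at the kind of field an AI-aware SBOM might need to add:

```python
import json

# Minimal component inventory, loosely modeled on CycloneDX's JSON shape.
# Field names are illustrative, not a validated CycloneDX document.
sbom = {
    "bomFormat": "CycloneDX",
    "components": [
        {"type": "library", "name": "flask", "version": "2.3.2",
         "licenses": ["BSD-3-Clause"]},
        {"type": "library", "name": "requests", "version": "2.31.0",
         "licenses": ["Apache-2.0"]},
    ],
    # Hypothetical extension: not part of any current SBOM spec.
    "ai_provenance": {"model": "example-codegen", "training_data": "unknown"},
}

def component_index(doc: dict) -> set[tuple[str, str]]:
    """SCA-style view: (name, version) pairs ready for vulnerability lookup."""
    return {(c["name"], c["version"]) for c in doc["components"]}

print(json.dumps(sorted(component_index(sbom))))
```

The open question raised above is whether entries like the hypothetical "ai_provenance" block, covering model identity and training-data lineage, become as mandatory as the component list itself.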
Vulnerability Management: The expectation is that AI code generation will significantly change the pace of software development, which means vulnerability management will have to adjust. There is a chance this could be a positive. More likely, though, the time gained will be spent shipping more features rather than fixing security-related issues. That will depend entirely on product management.
Developer Training: AI provides responses based on the source code provided to the model. As with most sample code, this is typically not done with security in mind. There is an excellent chance the AI-generated code is not secure. There’s also a good chance it is poorly written. Will it get better over time? The odds are that it will. As training data improves, it will improve. How it gets that training data is the question. And like StackOverflow today, developers should not just trust the data out of the box. Developers should be cautious about consuming AI responses in their products.
It will also be less likely that incorrect responses are identified, and harder to reach a team that can address those inaccuracies. That is troubling!
Where Do We Go From Here?
Application Security practitioners are probably already aware that code generators exist, even complex ones (see: JHipster). They have been around for a while. Most help with boilerplate code, but some even write code. That said, there hasn’t been anything like what we have seen recently. Many demos show how AI has been used to write video games, business applications, and screen grabs. All were created from scratch in hours versus what used to take days or weeks.
This code is probably not generated with security in mind (anecdotal data I've seen confirms this, and some of the code has simply been wrong). At the same time, it will be consumed much more rapidly.
Security professionals must be aware that existing processes will need adjusting, and existing tools may not be sufficient for the velocity of change this technology enables or the new technologies it introduces. It is quite possible existing tools won’t scale as needed. What does this mean for the industry as a whole?
Lastly, the application security industry can do something it has not done enough of in the past: build examples of secure code deployments to improve the training data of these AI code generators.
This gives AppSec VARs like True Positives a real advantage in the future. We are NOT beholden to any application security vendor, technology, or approach. We aim to integrate as many vendors and technologies into our partner model and use telemetry data to deliver the best results for our customers. This cornerstone belief will guide us well into the future for application security.