You’re getting a constant error with a particularly annoying piece of code. What do you do?

You might ask the rest of the team if they know the problem. You might rubber-duck it. You might google the error and see if anyone on Stack Overflow has come across the problem before.

As of a few months ago, a new option tops this list: you ask AI. You put your code into ChatGPT or Bard or Bing and you ask what’s going wrong. The AI spits out the answer and gives you the context for the error and solution.

You copy and paste the code and you understand the problem. Ideal ending.

But there’s a problem with this new solution. Your ‘chat’ with ChatGPT is a two-way conversation. You get your code fixed and an explanation. But ChatGPT also gets something: your code. This means that, if you are putting your company’s code into ChatGPT, you are sharing proprietary code. And companies don’t like that.

But if your company is one of those stopping you from accessing AI tools, nil desperandum. There are other tools available that can help you understand and fix your code without giving away the company’s secrets.

The companies shutting down AI access

Samsung, Amazon, Verizon, Walmart, J.P. Morgan Chase.

All these companies have restricted their employees’ use of ChatGPT. These restrictions fall into two buckets:

  1. Guidelines on what AI can be used for, in the case of Amazon, Verizon, and Walmart
  2. Outright bans, such as in the case of Samsung and J.P. Morgan Chase

The guidelines are basically along the lines of “don’t share sensitive information.” In a Slack message to employees, an Amazon lawyer warned against sharing “any Amazon confidential information (including Amazon code you are working on).”

The bans come because people have been doing exactly that. 11% of data employees paste into ChatGPT is confidential. At Samsung, three different employees were found to have shared proprietary information with ChatGPT. From Gizmodo:

  • “a Samsung employee copied the source code from a faulty semiconductor database into ChatGPT and asked it to help them find a fix.”
  • “an employee shared confidential code to try and find a fix for defective equipment.”
  • “Another employee reportedly submitted an entire meeting to the chatbot and asked it to create meeting minutes.”

So what is the problem? It is that OpenAI is using the information you send in your prompts to train their models. Interestingly, if you ask ChatGPT if it’s going to store and use your code in any way, it gives a pretty unequivocal no:

Prompt: If I share my code with you, what happens?

ChatGPT: As an AI model developed by OpenAI, I don't store or remember personal data shared with me during our conversation, including any code you share. This is to ensure your privacy and confidentiality.

Prompt: So my code won't be used by you in a future training set?

ChatGPT: Correct, as an AI developed by OpenAI, I don't have the ability to store or remember conversations, including any code you share with me. This means that the code you provide cannot be used in future training sets. OpenAI has implemented strict data usage policies to ensure user privacy and confidentiality. The model is designed in such a way that it doesn't have the capability to access or retrieve personal data from users unless explicitly provided by the user for the purpose of the conversation.

If you got that from ChatGPT, you’d think you were in the clear. No data is stored and “the code you provide cannot be used in future training sets.”

But if you look at the OpenAI help docs, in particular “How your data is used to improve model performance,” you’d definitely get a different answer:

“We continuously improve our models through research breakthroughs as well as exposure to real-world problems and data. When you share your data with us, it helps our models become more accurate and better at solving your specific problems and it also helps improve their general capabilities and safety… When you use our non-API consumer services ChatGPT or DALL-E, we may use the data you provide us to improve our models.”

You can turn this off in your data settings.

Who’s right? The superintelligent machine or the help docs? The help docs. Likely, when GPT-4 (the model underpinning ChatGPT) was trained, the idea was that user data wouldn’t be used for training purposes. And indeed, if you use the API directly, your data isn’t used for training by default. But with the incredible popularity of ChatGPT, OpenAI is using the data from its consumer product to improve models and usability.

Thus, anything you type into ChatGPT could turn up in an answer to someone else’s question. If you are inputting your company’s code (or your company’s financials) into ChatGPT, there’s a chance it will be ‘leaked’ to other users.

Using code visibility tools to help you understand your code

So you can’t use ChatGPT to help understand your codebase anymore. What are your options?

First, let’s take a step back and try to understand the problem you are trying to solve with ChatGPT. There are really three main questions engineers have for ChatGPT or other AI models:

  1. Can you help write my code?
  2. Can you help fix my code?
  3. Can you help me understand this code?

Without AI models, you are back to the old-school ways of answering that first question. Using team members, search, SO, MDN, Reddit, rubber ducks, whatever. Ultimately, you’ll have to go back to typing characters into an IDE.

But that’s OK. Asking AI to code produces one-dimensional code. Why? Because it’s simply missing the higher context of your codebase. Even if you tell it what you are trying to do, it can only give you a generic version of the code you need.

For the second and third questions though, you have another option to help you understand your code, code more effectively, and fix potential issues in your codebase.

These are code visibility tools. Code visibility tools, like CodeSee, are powerful aids for developers seeking to navigate and understand intricate codebases. They utilize advanced static analysis and indexing to dissect code structure and relationships, offering invaluable insights. These tools not only unravel the complexity of large-scale projects, but also foster enhanced understanding and productivity.

CodeSee and AI-powered tools such as ChatGPT or GitHub Copilot serve a common goal of simplifying the process of understanding and navigating complex codebases. Where code visibility tools excel is in allowing engineers to deal with the complexity and context of a large codebase. ChatGPT can help with code snippets, but CodeSee can help with codebases.

So when code visibility is paired with AI in CodeSee’s upcoming AI-Powered Code Understanding platform, the combined tool results in something incredibly effective. This upcoming AI tool will provide smart code recommendations with actionable insights within CodeSee’s core interactive visualizations to help you understand your codebase structure and the flow and dependencies in your code.

For instance, you might ask, “What third-party libraries do we use for auth?” You’d then get a list of the authentication libraries within your codebase along with a visualization of where these libraries are called.
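To make the idea concrete, here is a minimal sketch of how a static-analysis pass could answer that kind of question for Python files. It is not how CodeSee works internally; the `AUTH_PACKAGES` watch-list, the function names, and the example files are all assumptions made for illustration.

```python
import ast

# Hypothetical watch-list of auth-related packages; adjust it for your own stack.
AUTH_PACKAGES = {"jwt", "authlib", "oauthlib", "passlib"}


def imported_packages(source: str) -> set[str]:
    """Return the top-level package names imported by a Python source string."""
    packages = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                packages.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 skips relative imports such as "from . import helpers"
            packages.add(node.module.split(".")[0])
    return packages


def auth_packages_used(files: dict[str, str]) -> dict[str, set[str]]:
    """Map each file path to the auth packages it imports; files with none are omitted."""
    result = {}
    for path, source in files.items():
        hits = imported_packages(source) & AUTH_PACKAGES
        if hits:
            result[path] = hits
    return result


# Made-up file contents standing in for a real codebase.
example = {
    "app/login.py": "import jwt\nfrom passlib.hash import bcrypt\n",
    "app/utils.py": "import os\nfrom . import helpers\n",
}
print(auth_packages_used(example))
```

Everything here runs locally, so no source code leaves your machine. A real tool would add the visualization layer on top of an index like this one.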

This will be excellent for onboarding new engineers. They can quickly get up to speed on the codebase by asking natural language questions and then getting both text and visual guidance back.

With CodeSee, you can create onboarding maps for new engineers to help them understand complex codebases and how different systems interact. The AI will then be able to annotate the code with generated walkthroughs so the new engineer knows what each component of the code does.

Another example might be refactoring code. Asking the AI “Show me everything in my codebase that uses the GitHub API” will get you back a map of the codebase showing all the API calls. If you need to refactor for a different service, you now have a base to work from.
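As a rough, local approximation of that query, the sketch below walks a Python file’s syntax tree and reports every string literal that mentions the GitHub API host. The function name, the `marker` default, and the sample source are assumptions for illustration only; a real code map would also follow wrappers and client libraries, which this toy version does not.

```python
import ast


def github_api_calls(source: str, marker: str = "api.github.com") -> list[tuple[int, str]]:
    """Return (line number, string literal) pairs for literals that mention the marker."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, str)
            and marker in node.value
        ):
            hits.append((node.lineno, node.value))
    return sorted(hits)


# Made-up module standing in for a file in your codebase.
sample = '''import requests

def list_repos(user):
    return requests.get("https://api.github.com/users/" + user + "/repos")

def ping():
    return requests.get("https://example.com/health")
'''

for lineno, literal in github_api_calls(sample):
    print(f"line {lineno}: {literal}")
```

Running this over every file gives you a plain list of call sites to check off while you refactor.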

CodeSee allows you to visualize all your code and see it in a comprehensive manner, so you can refactor correctly. You can see how code is interacting, what engineers are working on, and what code is stale. Through mapping, you can easily visualize all of this and then start rewriting individual code blocks.

Once you’ve completed the refactoring and need to check in the code, the AI tool will be able to create a code automation and generate a PR for you detailing your changes, which is an immediate productivity improvement.

AI and code visibility are ideal partners

While AI has revolutionized our approach to problem-solving and understanding code, the current concerns around data privacy and code security can’t be overlooked. As we navigate these challenges, we must not lose sight of all the tools we have at our disposal outside the AI realm, each capable of providing invaluable insights into our codebase.

Code visibility tools are a prime example of these. They facilitate code visualization, making the task of untangling and understanding complex code more manageable. Developers can delve into the heart of their codebase, uncovering insights that can lead to efficient debugging, code optimization, and system improvement - all without compromising the sanctity of proprietary code.

In an ideal world, you could use code visibility to understand your code and your systems, and to provide context for any decisions. You could then use AI to iterate on your code, increase productivity, and dive deep into your codebase to provide insights.

You will be able to do all that with CodeSee’s upcoming AI-Powered Code Understanding platform. Combining AI and code visibility will provide a level of understanding and context you won’t get from either tool alone. Put them together, and you will be able to see your team’s productivity skyrocket as you resolve questions faster, maximize team performance with actionable insights, and effortlessly comprehend legacy codebases, something ChatGPT can never do on its own.

Be sure to sign up for the waitlist for CodeSee’s upcoming AI-Powered Code Understanding platform to receive updates and get access when it’s ready.