
Building a GPT agent that actually understands KAREL - possible?

Last updated 2 days ago
Posted 2 days ago
Posted by fanuc_programmer_dan

I spend way too much time debugging KAREL code and I'm wondering if anyone has built a custom GPT or LLM agent that can actually help with this. The problem is that KAREL is such a niche language that generic ChatGPT is basically useless - it hallucinates syntax and gives you code that won't even compile. I'm thinking about training or fine-tuning something on our internal KAREL codebase but I have no idea if that's realistic or even worth the effort. We have probably 200+ KAREL programs, mostly socket communication, motion control, and custom I/O handling. Has anyone done something similar for obscure industrial programming languages? Or am I better off just getting better at reading compiler errors lol.
Posted 2 days ago
Reply by ai_robotics_dev_michelle

Dan, I actually built something like this a few months ago for our team and it's been surprisingly helpful. The key is you don't necessarily need to fine-tune - you can get pretty far with RAG (retrieval augmented generation), where you feed relevant KAREL examples and documentation to the LLM as context. I scraped all the official Fanuc KAREL manuals, our internal code repos, and even old forum posts, then embedded everything into a vector database. When someone asks a question, it pulls the most relevant examples and the LLM uses those to give much better answers. Here's the basic setup I used:

from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA

# Load and embed KAREL documentation and code samples
def setup_karel_kb():
    embeddings = OpenAIEmbeddings()

    # Load your KAREL code and docs (load_karel_files is our own helper,
    # sketched at the end of this post)
    documents = load_karel_files('./karel_codebase/')

    # Create vector store
    vectorstore = FAISS.from_documents(documents, embeddings)

    # Setup QA chain
    llm = ChatOpenAI(model="gpt-4", temperature=0)
    qa_chain = RetrievalQA.from_chain_type(
        llm=llm,
        retriever=vectorstore.as_retriever(search_kwargs={"k": 5})
    )

    return qa_chain

# Query example
qa = setup_karel_kb()
response = qa.run("How do I properly close a socket connection in KAREL?")


It won't write perfect code but it's way better than asking vanilla ChatGPT. The retrieved examples give it actual syntax to work from.
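
For completeness, here's roughly what load_karel_files does - a simplified sketch that assumes your KAREL sources are exported as plain-text .kl files; the glob pattern and chunk sizes are placeholders you'd tune for your own repo:

from langchain.document_loaders import DirectoryLoader, TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

def load_karel_files(path):
    # Pick up every .kl source file under the folder
    loader = DirectoryLoader(path, glob="**/*.kl", loader_cls=TextLoader)
    documents = loader.load()

    # Split long programs into overlapping chunks so retrieval returns
    # focused snippets instead of whole files
    splitter = RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=200)
    return splitter.split_documents(documents)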
Posted 2 days ago
Reply by fanuc_programmer_dan

Michelle that's really clever, using RAG instead of fine-tuning makes way more sense given how little KAREL code exists in the world. How did you handle the fact that KAREL has different versions with slightly different syntax? We have some R-30iB controllers and some newer R-30iB Plus and the KAREL implementations are subtly different. Also curious how you deal with compiler-specific errors - like when you get "TRAN-160 illegal variable declaration" or whatever and need to figure out what you actually did wrong. Can the LLM help with that or is it just guessing?
Posted 2 days ago
Reply by industrial_controls_sam

For compiler errors I found you need to give the LLM examples of common errors and their solutions as part of the context. I created an "error database" where I documented every KAREL compiler error we've encountered over the years with the actual fix. Something like:

error_examples = """
ERROR: TRAN-160 illegal variable declaration
CAUSE: Variable declared inside a routine instead of at program level
FIX: Move VAR declarations to top of program before any routines

ERROR: TRAN-089 undefined routine
CAUSE: Routine called before it's defined, or typo in routine name
FIX: Ensure routine is defined before being called, check spelling

ERROR: EXEC-315 Stack overflow
CAUSE: Recursive routine calls or too many nested routine calls
FIX: Reduce nesting depth or increase stack size in program attributes
"""

Then when someone pastes a compiler error the LLM searches this database first before trying to reason about it. Way more reliable than letting it guess. Also helps to include the actual KAREL code that's failing so it can see the context.
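
If you want the lookup to be a proper search rather than just prompt-stuffing, you can embed each error entry separately - rough sketch using the same LangChain stack Michelle used (splitting my error text on blank lines is a simplification):

from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# One document per documented error (entries are separated by blank lines)
error_entries = [e.strip() for e in error_examples.split("\n\n") if e.strip()]
error_db = FAISS.from_texts(error_entries, OpenAIEmbeddings())

def lookup_error(compiler_error, k=3):
    # Return the k documented errors most similar to what the user pasted
    matches = error_db.similarity_search(compiler_error, k=k)
    return "\n\n".join(doc.page_content for doc in matches)

# This goes into the LLM prompt as context, alongside the failing code
print(lookup_error("TRAN-160 illegal variable declaration"))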
Posted 2 days ago
Reply by robotics_integrator_jeff

This is all interesting but I'm curious about the practical value. Like are you guys actually using this for production code or just learning/exploration? I've been burned too many times by AI-generated code that looks right but has subtle bugs. With KAREL especially, if you screw up motion commands or I/O timing you can crash a million dollar robot into something. Not trying to be negative just want to understand the risk management here. Do you have the LLM generate code and then you review it line by line, or are you trusting it more than that?
Posted 2 days ago
Reply by ai_robotics_dev_michelle

Jeff that's a totally valid concern. We absolutely do not trust it to generate production code unsupervised. The way we use it is more like an interactive documentation system - you ask it how to do something and it shows you examples from our existing codebase that work. Then you adapt those examples yourself. It's also really helpful for debugging where you paste your code and error message and it suggests what might be wrong based on similar issues we've seen before. Here's an example prompt structure that works well:

debug_prompt = f"""
You are a KAREL programming assistant with access to verified code examples.

TASK: Help debug this KAREL code
COMPILER ERROR: {error_message}
CODE:
{user_karel_code}

INSTRUCTIONS:
1. Search for similar error patterns in the knowledge base
2. Identify the likely cause based on verified examples
3. Suggest specific fix with reference to working code
4. DO NOT generate new code from scratch
5. If unsure, say "I need more context" rather than guessing

Retrieved examples:
{retrieved_similar_code}
"""

By constraining it to reference existing working code rather than generating new stuff, we reduce the hallucination problem significantly.
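
For reference, this is roughly how we wire it together - it assumes the vectorstore and llm from my earlier setup, and build_debug_prompt is just the f-string above wrapped in a function so the variables are in scope:

def debug_karel(vectorstore, llm, error_message, user_karel_code):
    # Pull the most similar code and error examples from the knowledge base
    docs = vectorstore.similarity_search(error_message + "\n" + user_karel_code, k=5)
    retrieved_similar_code = "\n\n".join(doc.page_content for doc in docs)

    # build_debug_prompt() is the f-string above wrapped in a function so the
    # three variables it references are in scope
    prompt = build_debug_prompt(error_message, user_karel_code, retrieved_similar_code)

    # Same ChatOpenAI instance as in the RAG setup
    return llm.predict(prompt)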
Posted 2 days ago
Reply by fanuc_programmer_dan

OK that makes sense, using it as a smart search tool rather than a code generator. I like the idea of the error database, we definitely have enough historical issues to populate that. Michelle how much effort was it to get all your KAREL code into a format the LLM could understand? A lot of our programs have binary dependencies or include files that reference hardware-specific configurations. Did you just strip all that out or try to preserve the full context?
Posted 2 days ago
Reply by ai_robotics_dev_michelle

Good question - I did strip out some stuff but tried to preserve as much context as possible. For include files I expanded them inline so the LLM could see the full picture. For hardware-specific stuff like I/O assignments I added comments explaining what the magic numbers meant. The preprocessing script looked something like:

def process_karel_file(filepath):
    with open(filepath, 'r') as f:
        content = f.read()

    # Expand includes
    content = expand_includes(content)

    # Add context comments for hardware references
    content = annotate_io_references(content)

    # Extract metadata
    metadata = {
        'filename': filepath,
        'robot_model': extract_robot_model(content),
        'controller_version': extract_controller_version(content),
        'purpose': extract_program_purpose(content)
    }

    return {'content': content, 'metadata': metadata}


The metadata is super important because then you can filter results based on robot model or controller version. Like if someone asks about socket communication on an R-30iB Plus you don't want to show them examples from an R-30iA that use different syntax.
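
The filtering itself is nothing fancy - we just over-fetch and then keep the documents whose metadata matches, something like:

def search_karel_examples(vectorstore, query, controller_version=None, k=5):
    # Over-fetch, then keep only examples whose metadata matches the
    # controller the user actually has
    docs = vectorstore.similarity_search(query, k=k * 4)
    if controller_version:
        docs = [d for d in docs if d.metadata.get('controller_version') == controller_version]
    return docs[:k]

# Only show R-30iB Plus examples for a socket question
examples = search_karel_examples(vectorstore, "socket communication",
                                 controller_version="R-30iB Plus")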
Posted 2 days ago
Reply by automation_engineer_lisa

This thread has me thinking about doing something similar for Siemens SCL code which is equally obscure and poorly documented. One concern though - if you're embedding your company's proprietary KAREL code into OpenAI's systems aren't you potentially leaking IP? Or are you running local models? We're in automotive and our lawyers would have a fit if we sent our robot programs to external APIs. How are you handling the data privacy aspect?
Posted 2 days ago
Reply by ai_robotics_dev_michelle

Lisa great point. We're using Azure OpenAI which has data residency guarantees and doesn't train on customer data, plus we're in a region with GDPR compliance. That said, we still scrubbed anything truly proprietary before embedding - removed customer names, specific part numbers, trade secret processes, etc. Just kept the generic programming patterns and common debugging scenarios. For companies with stricter requirements you could use a local model like CodeLlama or Mistral, they're not as good but they work offline. There's also the option of using GitHub Copilot for Business which keeps your data isolated. Really depends on your risk tolerance and budget.
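
If you do go local, the swap in LangChain is fairly mechanical - rough sketch, the model names are just examples and you'll want a GPU for anything usable; the rest of the RAG chain stays the same:

from langchain.embeddings import HuggingFaceEmbeddings
from langchain.llms import HuggingFacePipeline

# Local embeddings instead of calling the OpenAI embedding API
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

# Local code model instead of GPT-4 (example model id - pick what fits your hardware)
llm = HuggingFacePipeline.from_model_id(
    model_id="codellama/CodeLlama-7b-Instruct-hf",
    task="text-generation",
    pipeline_kwargs={"max_new_tokens": 512},
)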
Posted 2 days ago
Reply by fanuc_programmer_dan

Alright I'm sold, going to try building a basic version of this with RAG and see how it goes. Thanks for all the detailed examples Michelle, super helpful. One last thing - have any of you tried using these LLM agents for translating between KAREL and other languages? Like we have some old robots running KAREL and newer ones with Python on embedded PCs, and sometimes we need to port functionality between them. Wondering if an LLM could help with that translation or if the semantic gap is too big.