Gentle Introduction to LangGraph: A Step-by-Step Tutorial



This content originally appeared on Level Up Coding – Medium and was authored by Dr. Varshita Sher


If you don’t know where to get started, this is for you

I might be a little late to the LangGraph party but it doesn’t matter! As I started reading/hearing/seeing Agentic AI pretty much everywhere in my feed, LangGraph was obviously my first choice for playing around with it, given my previous knowledge of the LangChain framework. (There are others like CrewAI and AutoGen which we will get to eventually.)

While there are a lot of tutorials on building AI agents with LangGraph (think automated vacation planner or PhD research assistant), I wanted to go back one step and cover the basics so anyone can follow those tutorials.

More specifically, how to (intuitively) build the graph, how to pass information between nodes in the graph, how conditional edges work, how to implement custom operators to save information within each node, etc.

GitHub – V-Sher/LangGraphTutorial: Gentle Intro to LangGraph basics

Why should I bother with LangGraph when I know LangChain?

LangGraph is an orchestration framework for complex agentic systems and is more low-level and controllable than LangChain agents. [Source].

Here’s an example to explain in layman's terms. In one of my previous articles, I covered multi-hop questions using ReAct agents from LangChain. These agents work by running a thought-action-observation loop. In some cases, at runtime, the plan generated by the LLM might differ from what you’d like. This is where LangGraph gives you the flexibility to predefine the exact flow you’d like, which is guaranteed to be followed every single time during inference.

How should I start building the graph?

In my experience, a bottom-up technique is the best and the most intuitive way to write LangGraph code.

Here’s how to approach it: first come up with a graph on paper. This is my rough draft.

Image by Author

I am extracting relevant content, i.e. customer remarks, from the result payload (think of the payload as a dictionary returned by the API that contains things other than customer_remark, such as time_of_comment, social_media_channel, and number_of_likes). Then, depending on whether it is a compliment (for example, "I really love this toothpaste") or a question ("Why has the packaging changed for the toothpaste?"), I will trigger different pieces of code: run_compliment_code or run_question_code.

This makes sense at a practical level since we get a lot of comments on our social media and while getting lots of positive feedback is good, we want to filter and quickly respond to questions. Finally, we want to run a beautify node that will beautify the answer before being presented to the end user.

From here on, we are ready to start writing the code to create the nodes using add_node(). Each circle in your rough draft becomes a node:

from langgraph.graph import StateGraph
graph_builder = StateGraph(State)

graph_builder.add_node("extract_content", extract_content)
graph_builder.add_node("run_question_code", run_question_code)
graph_builder.add_node("run_compliment_code", run_compliment_code)
graph_builder.add_node("beautify", beautify)

Things to keep in mind:

  • There is no need to create the start and end nodes yourself. LangGraph has built-in ones that we will use later when creating the edges (from langgraph.graph import END, START).
  • Even though the two arguments to add_node() look identical, they don’t need to be. The first is the name of the node and the second is the Python function that contains the logic for what happens at this node.
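To make the second point concrete, here’s a toy model of node registration (purely illustrative; pull_remark is a hypothetical function name, not from the article). The node name and the function name are two separate things:

```python
# Toy model of node registration (illustrative; not LangGraph's real internals).
nodes = {}

def add_node(name, fn):
    # Register `fn` under `name` -- the two need not be identical.
    nodes[name] = fn

def pull_remark(state):  # hypothetical function name, different from node name
    return {"text": state["payload"][0]["customer_remark"]}

add_node("extract_content", pull_remark)  # node name != function name
result = nodes["extract_content"]({"payload": [{"customer_remark": "I really love this toothpaste"}]})
print(result)  # -> {'text': 'I really love this toothpaste'}
```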

Next, let’s define the edges using add_edge().

from langgraph.graph import END, START

graph_builder.add_edge(START, "extract_content")
graph_builder.add_edge("run_question_code", "beautify")
graph_builder.add_edge("run_compliment_code", "beautify")
graph_builder.add_edge("beautify", END)

Note: Remember, the two parameters for add_edge() should be the names of the nodes, not the names of the Python functions. Be careful of this if you didn’t pick identical names for the two.

And finally, the conditional edge from the extract_content node:

graph_builder.add_conditional_edges(
    "extract_content",
    route_question_or_compliment,
    {
        "compliment": "run_compliment_code",
        "question": "run_question_code",
    },
)

This edge takes three arguments:

(a) the name of the node after which the conditional check will happen, in our case the extract_content node.

(b) the function that contains the logic for this conditional check, in our case the route_question_or_compliment() function. This function checks whether we have received a compliment or a question and returns a string, either "compliment" or "question".

(c) a dictionary with N keys, where N is the number of possible outputs from the conditional-check function (in our case, two), and whose values are the nodes to redirect to, for instance {"compliment": "run_compliment_code", "question": "run_question_code"}. In my case, I have two dedicated nodes (run_compliment_code and run_question_code) for handling the two scenarios, as you can see in the image above.
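Under the hood, the dispatch boils down to a simple lookup. Here’s a plain-Python sketch of the idea (illustrative only; LangGraph handles this for you):

```python
# Plain-Python sketch of how a conditional edge dispatches (not LangGraph internals).
def route_question_or_compliment(state):
    # Returns the routing key, which is then looked up in the path map.
    return "question" if "?" in state["text"] else "compliment"

path_map = {
    "compliment": "run_compliment_code",
    "question": "run_question_code",
}

state = {"text": "Why has the packaging changed for the toothpaste?"}
next_node = path_map[route_question_or_compliment(state)]
print(next_node)  # -> run_question_code
```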

Next, let’s define the State class. This is where we will store all the variables that will be used across all the nodes. For instance, I know I will need text to store the extracted content from the first node and use it as part of the conditional node to check whether this text is a question or a compliment. Similarly, we will need answer to store the final answer from the last beautify node.

from typing_extensions import TypedDict

class State(TypedDict):
    text: str
    answer: str
    payload: dict[str, list]

Next, for each node, let’s define the associated python functions:

  1. extract_content node
def extract_content(state: State):
    return {"text": state["payload"][0]["customer_remark"]}

What this function will do is update the text variable in State with the appropriate content, i.e. the customer_remark. But where does it find the customer_remark? Inside the payload, which can be accessed using state["payload"].
Note: We will be passing the payload at the time of calling the graph as a dictionary, such as graph.invoke({"payload": [{"time_of_comment": "20-01-2025", ...}]})

Also, as a rule of thumb:

  • The function associated with each node will usually update one or more of the predefined state variables (after all, we want them to be available to other nodes). The way to do so is by having the function return a dictionary keyed by those variable names, in our case text.
  • At any node, if you want to access a variable, all you need to do is look inside state, for example state["text"] or state["payload"].
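Both rules can be modelled in a few lines of plain Python (a simplified model of the default behaviour, where each returned key overwrites the old value; not LangGraph’s actual implementation):

```python
from typing import TypedDict

class State(TypedDict, total=False):
    text: str
    answer: str
    payload: list

def extract_content(state: State):
    # Return only the keys to update, not the whole state.
    return {"text": state["payload"][0]["customer_remark"]}

# One step of the graph: merge the node's partial update back into the state.
state: State = {"payload": [{"customer_remark": "I really love this toothpaste"}]}
state = {**state, **extract_content(state)}
print(state["text"])  # -> I really love this toothpaste
```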

2. route_question_or_compliment node

In the interest of simplicity, we will be tagging a piece of text as a question if it contains a question mark; otherwise, it’s a compliment.

def route_question_or_compliment(state: State):
    if "?" in state["text"]:
        return "question"
    else:
        return "compliment"

Note: We can, of course, make it a little more complicated by letting an LLM decide if it’s a compliment or a question.

from langchain_openai import AzureChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

llm = AzureChatOpenAI(
    deployment_name="gpt-4o",
    model_name="gpt-4o",
    azure_endpoint=AZURE_OPENAI_ENDPOINT,
    openai_api_version=OPENAI_DEPLOYMENT_VERSION,
    openai_api_key=OPENAI_API_KEY,
    openai_api_type="azure",
    temperature=0.1,
)

template = """
I have a piece of text: {text}.
Tell me whether it is a 'compliment' or a 'question'.
"""
prompt = ChatPromptTemplate([("user", template)])
chain = prompt | llm | StrOutputParser()

def route_question_or_compliment(state: State):
    response = chain.invoke({"text": state["text"]})
    return response

3. run_compliment_code() node & run_question_code() node

def run_compliment_code(state: State):
    return {"answer": "Thanks for the compliment."}


def run_question_code(state: State):
    return {"answer": "Wow nice question."}

Note: Even though both these functions receive the state as input, we don’t need to use it. We can simply update the answer variable.

4. beautify() node

For simplicity, I am just appending the word "beautified" at the end. However, feel free to use another LLM call here.

def beautify(state: State):
    return {"answer": [state["answer"] + "beautified"]}

Things to note here:

  • we will be extracting the answer from the state (i.e. state[“answer”]) and then updating the same variable again (return {“answer”: ….}). In other words, we will be overwriting the previous contents of that variable. (We will see later how to append to it instead of overwriting)

Let’s visualize our graph first

graph = graph_builder.compile()

from IPython.display import Image, display
display(Image(graph.get_graph().draw_mermaid_png()))
Image by Author

Note: The dotted edges from extract_content to the other two nodes indicate that only one of those two edges will be executed at run time. Solid edges indicate that they would always be run.

Let’s run our first example

If you remember, we set answer as a state variable to store the final response. Let’s see its value (and those of all the other variables) using invoke().

graph.invoke({
    "payload": [
        {
            "time_of_comment": "20-01-2025",
            "customer_remark": "I hate this.",
            "social_media_channel": "facebook",
            "number_of_likes": 100,
        }
    ]
})

**** OUTPUT ****

{'text': 'I hate this.',
'answer': ['Thanks for the compliment.beautified'],
'payload': [{'time_of_comment': '20-01-2025',
'customer_remark': 'I hate this.',
'social_media_channel': 'facebook',
'number_of_likes': 100}]}

You will notice that since we don’t store the response from run_compliment_code() in a separate variable, the beautify() node overwrites the content of answer that we got from run_compliment_code(). Let’s see how we can keep adding to the answer variable instead of overwriting it.

P.S. If you need the step-by-step response at each node, you can use stream(). This is useful if you want to send updates to the front end to show a progress bar.

for step in graph.stream({
    "payload": [
        {
            "time_of_comment": "20-01-2025",
            "customer_remark": "I hate this.",
            "social_media_channel": "facebook",
            "number_of_likes": 100,
        }
    ]
}):
    print(step)


**** OUTPUT *****
{'extract_content': {'text': 'I hate this.'}}
{'run_compliment_code': {'answer': 'Thanks for the compliment.'}}
{'beautify': {'answer': ['Thanks for the compliment.beautified']}}

How to append content to variables that are already in the state?

We will start by switching the type of answer from a string to a list. The way you do it in LangGraph is slightly fancy: Annotated[list, operator.add].

The first argument is the usual data type, i.e. list, and the second argument, operator.add, tells LangGraph to keep adding to this list instead of overwriting it.
The first argument is our usual data type i.e. list and the second argument operator.add tells that we need to keep adding to this list.

FYI — operator.add works with lists (Annotated[list, operator.add]), strings (Annotated[str, operator.add]) and numbers (Annotated[int, operator.add]).

import operator
from typing import Annotated

class State(TypedDict):
    text: str
    # answer: str ## old way
    answer: Annotated[list, operator.add]  ## new way
    payload: dict[str, list]
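To see what that reducer actually does, here’s a plain-Python simulation of how an Annotated[list, operator.add] channel accumulates values across node updates (a simplified mental model, not LangGraph’s internals):

```python
import operator

# Simulating how an Annotated[list, operator.add] channel accumulates values
# across node updates: each update is combined with the old value via operator.add.
answer = []                                                    # initial channel value
answer = operator.add(answer, ["Thanks for the compliment."])  # run_compliment_code's update
answer = operator.add(answer, ["Thanks for the compliment.beautified"])  # beautify's update
print(answer)
# -> ['Thanks for the compliment.', 'Thanks for the compliment.beautified']
```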

Accordingly, we will change the way we define the node functions such that they are now returning a list instead of a string.

def run_compliment_code(state: State):
    return {"answer": ["Thanks for the compliment."]}  ## now returns a list


def run_question_code(state: State):
    return {"answer": ["Wow nice question."]}  ## now returns a list

Since the answer list will now contain more than one item, we will fetch only the last item (state["answer"][-1]) in the beautify node.

def beautify(state: State):
    return {
        "answer": [
            str(state["answer"][-1])  ## fetch last item from state["answer"] list
            + "beautified"
        ]
    }

Keeping everything else the same, let’s recompile the graph (I am calling it graph2 this time) and run it.

graph2.invoke(
    {
        "payload": [
            {
                "time_of_comment": "20-01-2025",
                "customer_remark": "I hate this.",
                "social_media_channel": "facebook",
                "number_of_likes": 100,
            }
        ]
    }
)

*** OUTPUT ****
{'text': 'I hate this.',
'answer': ['Thanks for the compliment.',
'Thanks for the compliment.beautified'],
'payload': [{'time_of_comment': '20-01-2025',
'customer_remark': 'I hate this.',
'social_media_channel': 'facebook',
'number_of_likes': 100}]}

As you can see in the output, the answer list contains the intermediate answer, i.e. the output from run_compliment_code(), as well as the final beautified output from the beautify() node: 'answer': ['Thanks for the compliment.', 'Thanks for the compliment.beautified']

How to implement custom operators?

P.S. operator.add doesn’t work with dictionaries. Here’s what operator.add actually works with:

  • Lists: [1, 2] + [3, 4] = [1, 2, 3, 4]
  • Numbers: 1 + 2 = 3
  • Strings: "hello" + "world" = "helloworld"

But for dictionaries, operator.add will ALWAYS raise a TypeError. To fix this, we need to define a custom operator.
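This is easy to verify in plain Python:

```python
import operator

# dict + dict is not defined in Python, so operator.add raises TypeError:
try:
    operator.add({"a": 1}, {"b": 2})
    raised = False
except TypeError:
    raised = True
print(raised)  # -> True
```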

Let’s modify our code such that answer is no longer of type list but a dictionary Annotated[dict, merge_dicts].

class State(TypedDict):
    text: str
    # answer: Annotated[list, operator.add] ## old way
    answer: Annotated[dict, merge_dicts]  ## new way
    payload: dict[str, list]

As you can see, we have replaced the standard operator.add with our new function, merge_dicts. This function contains the logic for how to combine the contents of two dictionaries. In our case, we have a simple implementation. This is how it looks:

def merge_dicts(dict1, dict2):
    return {**dict1, **dict2}
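One subtlety worth knowing: with this implementation, if the two dictionaries share a key, the value from dict2 silently wins. A quick check:

```python
def merge_dicts(dict1, dict2):
    return {**dict1, **dict2}

# Disjoint keys are simply combined...
print(merge_dicts({"temp_answer": "Thanks"}, {"final_beautified_answer": ["ok"]}))
# -> {'temp_answer': 'Thanks', 'final_beautified_answer': ['ok']}

# ...but on a key collision, dict2's value overwrites dict1's:
print(merge_dicts({"k": "old"}, {"k": "new"}))  # -> {'k': 'new'}
```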

Next, we also need to update the outputs returned by the run_compliment_code() and run_question_code() nodes so that each returns a dictionary (I am naming the key temp_answer).

def run_compliment_code(state: State):
    return {"answer": {"temp_answer": "Thanks for the compliment."}}


def run_question_code(state: State):
    return {"answer": {"temp_answer": "Wow nice question."}}

Similarly, the beautify node will also need to return a dictionary now (I am naming the key final_beautified_answer):

def beautify(state: State):
    print(state)
    return {
        "answer": {
            "final_beautified_answer": [
                str(state["answer"]["temp_answer"]) + "beautified"
            ]
        }
    }

Lastly, we are ready to run the code (compiled as graph3) one more time and observe the output.

graph3.invoke(
    {
        "payload": [
            {
                "time_of_comment": "20-01-2025",
                "customer_remark": "I hate this.",
                "social_media_channel": "facebook",
                "number_of_likes": 100,
            }
        ]
    }
)

*** OUTPUT ****
{'text': 'I hate this.',
'answer': {'temp_answer': 'Thanks for the compliment.',
'final_beautified_answer': ['Thanks for the compliment.beautified']},
'payload': [{'time_of_comment': '20-01-2025',
'customer_remark': 'I hate this.',
'social_media_channel': 'facebook',
'number_of_likes': 100}]}

If you look at the answer variable in the output, it is not all that different from last time. Previously, answer was a list containing the intermediate and the final beautified responses. Now it is a dictionary containing the same things.

While the custom operator here is quite simple, the whole point of showing you how to create them was that if ever the dictionary structure is quite complex (for instance key is a string but value is a dictionary of lists of dictionaries), you should know how to merge them (we will look at one such example in Part 2).
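For illustration, here is one possible sketch of such a deeper merge (my own hypothetical implementation, not the code from Part 2): nested dictionaries are merged recursively, lists are concatenated, and anything else is overwritten by the second dictionary.

```python
def merge_dicts_deep(dict1, dict2):
    """Hypothetical deep merge: recurse into nested dicts, concatenate lists,
    and otherwise let dict2's value overwrite dict1's."""
    merged = dict(dict1)
    for key, value in dict2.items():
        if key in merged and isinstance(merged[key], dict) and isinstance(value, dict):
            merged[key] = merge_dicts_deep(merged[key], value)   # recurse into nested dicts
        elif key in merged and isinstance(merged[key], list) and isinstance(value, list):
            merged[key] = merged[key] + value                    # concatenate lists
        else:
            merged[key] = value                                  # dict2 wins otherwise
    return merged

print(merge_dicts_deep({"a": {"x": [1]}}, {"a": {"x": [2], "y": 3}}))
# -> {'a': {'x': [1, 2], 'y': 3}}
```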

How to create branches for parallel node execution?

What if we wanted to also tag the type of the customer remark (packaging, sustainability, medical) whilst it is deciding whether the customer remark is a question or a compliment? And later use that tag in the beautify part of the code?

Back to the drawing board:

Image by Author

In this flow chart, the tag_query node and one of the two nodes (either run_question_code or run_compliment_code) are executed concurrently in the same superstep. Because they are in the same superstep, the beautify node executes only after both of them have finished.

Let’s update the code to reflect these changes. As mentioned previously, I prefer a bottom-up approach, so I will (a) add the nodes and edges first, (b) then write the function for any newly created node, (c) see what values this function returns and update the state variables accordingly, and as an additional step (d) update the beautify code to make use of this new variable.

(a) update edges and nodes

graph_builder.add_node("tag_query", tag_query)
graph_builder.add_edge("tag_query", "beautify")

(b) write the function for the new node, tag_query.

def tag_query(state: State):
    if "package" in state["text"]:
        return {"tag": "Packaging"}
    elif "price" in state["text"]:
        return {"tag": "Pricing"}
    else:
        return {"tag": "General"}
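Before wiring it into the graph, it’s easy to sanity-check the tagging logic on its own (the sample remarks below are my own inventions):

```python
def tag_query(state):
    if "package" in state["text"]:
        return {"tag": "Packaging"}
    elif "price" in state["text"]:
        return {"tag": "Pricing"}
    else:
        return {"tag": "General"}

print(tag_query({"text": "Why has the package design changed?"}))  # -> {'tag': 'Packaging'}
print(tag_query({"text": "What is the price of this?"}))           # -> {'tag': 'Pricing'}
print(tag_query({"text": "I hate this."}))                         # -> {'tag': 'General'}
```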

(c) update the State class to include a new variable tag so it is accessible by the beautify node

class State(TypedDict):
    text: str
    tag: str  ## newly added
    answer: Annotated[dict, merge_dicts]
    payload: dict[str, list]

This is what the final graph looks like:

Image by Author

Note: The dotted edges and solid edges serve purposes as described at the start of this article.

(d) update beautify node by using the tag to specify which department to forward the query

def beautify(state: State):
    return {
        "answer": {
            "final_beautified_answer": [
                str(state["answer"]["temp_answer"])
                + f'I will pass it to the {state["tag"]} Department'  ## newly added
            ]
        }
    }

Finally, it’s time to run the code (compiled this time as graph4).

graph4.invoke(
    {
        "payload": [
            {
                "time_of_comment": "20-01-2025",
                "customer_remark": "I hate this.",
                "social_media_channel": "facebook",
                "number_of_likes": 100,
            }
        ]
    }
)

*** OUTPUT ****
{'text': 'I hate this.',
'tag': 'General',
'answer': {'temp_answer': 'Thanks for the compliment.',
'final_beautified_answer': ['Thanks for the compliment.I will pass it to the General Department']},
'payload': [{'time_of_comment': '20-01-2025',
'customer_remark': 'I hate this.',
'social_media_channel': 'facebook',
'number_of_likes': 100}]}

If you look at final_beautified_answer, you’ll see the updated answer that passes the query to the right department: 'final_beautified_answer': ['Thanks for the compliment.I will pass it to the General Department'].

While I appreciate this is an overly simplified way to showcase parallel code execution, this will come in handy when you want to call the LLM in an async manner where you don’t need to wait for the output from the first LLM call to run the second LLM call. For instance, node #1 is summarizing some docs and node #2 is writing those docs to a database.
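As a rough illustration of that idea outside LangGraph (the two coroutines below are hypothetical stand-ins for an LLM call and a database write), asyncio lets both tasks run concurrently rather than one waiting on the other:

```python
import asyncio

async def summarize_docs(docs):
    await asyncio.sleep(0.01)  # stands in for an LLM call
    return f"summary of {len(docs)} docs"

async def write_docs_to_db(docs):
    await asyncio.sleep(0.01)  # stands in for a database write
    return f"wrote {len(docs)} docs"

async def main():
    docs = ["doc1", "doc2"]
    # gather() runs both coroutines concurrently and collects their results.
    return await asyncio.gather(summarize_docs(docs), write_docs_to_db(docs))

summary, status = asyncio.run(main())
print(summary)  # -> summary of 2 docs
print(status)   # -> wrote 2 docs
```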

Final Thoughts

Speaking of oversimplified, it is finally time to use all that we have learned today to create a RAG-based workflow in Part 2. It will read the user question, fetch relevant docs from the vector database, check whether we have enough docs to answer the question, and if so, apply a map-reduce chain over the docs and finally return the answer to the user question after a bit of beautification.

In Part 2, we will also learn how to use LangGraph’s Send API. It is mostly used in conjunction with conditional edges and is crucial for implementing map-reduce in LangGraph since it allows us to send different States to different nodes in the workflow.



