<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[I am Zahiruddin Tavargere (Zahere). A social-learner, here to learn, share and grow with the tech community.]]></title><description><![CDATA[I am Zahiruddin Tavargere (Zahere). A firm believer in social learning, I owe my dev career to all the tech content creators I have learned from - this is my contribution back to the community.]]></description><link>https://zahere.com</link><generator>RSS for Node</generator><lastBuildDate>Sun, 12 Apr 2026 16:40:00 GMT</lastBuildDate><atom:link href="https://zahere.com/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[📡 FastAPI MCP SSE Server with JWT Auth & Custom Client]]></title><description><![CDATA[📖 Introduction
In modern AI applications, communication between clients and tools isn’t always as simple as calling an API. The Model Context Protocol (MCP) provides a standardized way for clients to exchange information, invoke tools, and maintain ...]]></description><link>https://zahere.com/fastapi-mcp-sse-server-with-jwt-auth-and-custom-client</link><guid isPermaLink="true">https://zahere.com/fastapi-mcp-sse-server-with-jwt-auth-and-custom-client</guid><category><![CDATA[mcp-auth]]></category><category><![CDATA[mcp]]></category><category><![CDATA[mcp server]]></category><category><![CDATA[MCP Client]]></category><dc:creator><![CDATA[Zahiruddin Tavargere]]></dc:creator><pubDate>Sun, 18 May 2025 19:36:02 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1747596901479/2551ab79-b5e4-420a-b86e-0b7ed39f8cdb.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">📖 Introduction</h2>
<p>In modern AI applications, communication between clients and tools isn’t always as simple as calling an API. The <strong>Model Context Protocol (MCP)</strong> provides a standardized way for clients to exchange information, invoke tools, and maintain shared context over persistent connections — typically via <strong>Server-Sent Events (SSE)</strong>.</p>
<p>In this post, I’ll walk you through:</p>
<ul>
<li><p>Building an MCP SSE server using <strong>FastAPI</strong></p>
</li>
<li><p>Securing it with <strong>JWT authentication</strong></p>
</li>
<li><p>Implementing a <strong>custom Python client</strong> to connect, authenticate, and use MCP tools<br />  We’ll use simple <code>weather</code> and <code>time</code> tools to demo tool calling through MCP.</p>
</li>
</ul>
<h2 id="heading-project-overview">🎛️ Project Overview</h2>
<p>We’ll build:</p>
<ul>
<li><p>A <strong>FastAPI server</strong> that exposes an MCP-compliant SSE endpoint</p>
</li>
<li><p>A <strong>token-based auth system</strong></p>
</li>
<li><p>Simple tools to get weather and time for a given location</p>
</li>
<li><p>A <strong>Python client</strong> that authenticates, connects via SSE, and invokes the tool dynamically</p>
</li>
</ul>
<hr />
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/D-m5J4rTGN8">https://youtu.be/D-m5J4rTGN8</a></div>
<p> </p>
<hr />
<h2 id="heading-tech-stack">📦 Tech Stack</h2>
<ul>
<li><p>Python 3.11+</p>
</li>
<li><p>Python MCP SDK</p>
</li>
<li><p>FastAPI</p>
</li>
<li><p>aiohttp</p>
</li>
<li><p>PyJWT</p>
</li>
<li><p>Pydantic</p>
</li>
<li><p>loguru (for clean logs)</p>
</li>
</ul>
<h2 id="heading-setting-up-the-mcp-sse-server-serverpy">Setting up the MCP SSE Server (server.py)</h2>
<p>Let’s first import the relevant libraries and load credentials and API keys from environment variables.</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> datetime
<span class="hljs-keyword">import</span> os
<span class="hljs-keyword">from</span> zoneinfo <span class="hljs-keyword">import</span> ZoneInfo
<span class="hljs-keyword">from</span> fastapi <span class="hljs-keyword">import</span>  FastAPI, HTTPException, Request
<span class="hljs-keyword">from</span> pydantic <span class="hljs-keyword">import</span> BaseModel
<span class="hljs-keyword">import</span> requests
<span class="hljs-keyword">from</span> starlette.applications <span class="hljs-keyword">import</span> Starlette
<span class="hljs-keyword">from</span> starlette.routing <span class="hljs-keyword">import</span> Route, Mount
<span class="hljs-keyword">import</span> jwt
<span class="hljs-keyword">from</span> mcp.server.fastmcp <span class="hljs-keyword">import</span> FastMCP
<span class="hljs-keyword">from</span> mcp.server.sse <span class="hljs-keyword">import</span> SseServerTransport
<span class="hljs-keyword">from</span> loguru <span class="hljs-keyword">import</span> logger

<span class="hljs-keyword">from</span> dotenv <span class="hljs-keyword">import</span> load_dotenv

load_dotenv()
</code></pre>
<p>Let’s initialize the MCP server and set up the tools.</p>
<pre><code class="lang-python"><span class="hljs-comment"># Initialize the MCP server with your tools</span>
mcp = FastMCP(
    name=<span class="hljs-string">"Weather and Time SSE Server"</span>
)


<span class="hljs-meta">@mcp.tool()</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">TimeTool</span>(<span class="hljs-params">input_timezone</span>):</span>
    <span class="hljs-string">"Provides the current time for a given city's timezone like Asia/Kolkata, America/New_York etc. If no timezone is provided, it returns the local time."</span>
    format = <span class="hljs-string">"%Y-%m-%d %H:%M:%S %Z%z"</span>
    current_time = datetime.datetime.now()    
    <span class="hljs-keyword">if</span> input_timezone:
        print(<span class="hljs-string">"TimeZone"</span>, input_timezone)
        current_time =  current_time.astimezone(ZoneInfo(input_timezone))
    <span class="hljs-keyword">return</span> <span class="hljs-string">f"The current time is <span class="hljs-subst">{current_time}</span>."</span>

transport = SseServerTransport(<span class="hljs-string">"/messages/"</span>)


<span class="hljs-meta">@mcp.tool()</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">weather_tool</span>(<span class="hljs-params">location: str</span>):</span>
    <span class="hljs-string">"""Provides weather information for a given location"""</span>        
    api_key = os.getenv(<span class="hljs-string">"OPENWEATHERMAP_API_KEY"</span>)
    url = <span class="hljs-string">f"http://api.openweathermap.org/data/2.5/weather?q=<span class="hljs-subst">{location}</span>&amp;appid=<span class="hljs-subst">{api_key}</span>&amp;units=metric"</span>
    response = requests.get(url)
    data = response.json()
    <span class="hljs-keyword">if</span> data[<span class="hljs-string">"cod"</span>] == <span class="hljs-number">200</span>:
        temp = data[<span class="hljs-string">"main"</span>][<span class="hljs-string">"temp"</span>]
        description = data[<span class="hljs-string">"weather"</span>][<span class="hljs-number">0</span>][<span class="hljs-string">"description"</span>]
        <span class="hljs-keyword">return</span> <span class="hljs-string">f"The weather in <span class="hljs-subst">{location}</span> is currently <span class="hljs-subst">{description}</span> with a temperature of <span class="hljs-subst">{temp}</span>°C."</span>
    <span class="hljs-keyword">else</span>:
        <span class="hljs-keyword">return</span> <span class="hljs-string">f"Sorry, I couldn't find weather information for <span class="hljs-subst">{location}</span>."</span>
</code></pre>
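<p>The interesting part of <code>weather_tool</code> is how it formats the OpenWeatherMap response. As a sketch, the same formatting logic can be exercised against a hand-written dict shaped like an API reply (no network call; the values are made up for illustration):</p>

```python
# Sketch: the response-formatting logic from weather_tool, run against a
# hypothetical dict shaped like an OpenWeatherMap reply (no API call made).

def format_weather(location: str, data: dict) -> str:
    # OpenWeatherMap returns "cod": 200 on success
    if data["cod"] == 200:
        temp = data["main"]["temp"]
        description = data["weather"][0]["description"]
        return f"The weather in {location} is currently {description} with a temperature of {temp}°C."
    return f"Sorry, I couldn't find weather information for {location}."

sample = {
    "cod": 200,
    "main": {"temp": 30.37},
    "weather": [{"description": "clear sky"}],
}
print(format_weather("Dubai", sample))
```

<p>Separating the formatting from the HTTP call like this also makes the tool easy to unit-test without hitting the API.</p>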
<p>We will now set up the JWT auth system and use Starlette routes to expose the SSE and message endpoints to clients.</p>
<pre><code class="lang-python">
<span class="hljs-comment"># Demo value; in production, load the secret from an environment variable</span>
SECRET_KEY = <span class="hljs-string">"my_super_secret_key"</span>
ALGORITHM = <span class="hljs-string">"HS256"</span>     
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">check_auth</span>(<span class="hljs-params">request: Request</span>):</span>
    auth = request.headers.get(<span class="hljs-string">"authorization"</span>, <span class="hljs-string">""</span>)        
    <span class="hljs-keyword">if</span> auth.startswith(<span class="hljs-string">"Bearer "</span>):
        token = auth.split(<span class="hljs-string">" "</span>, <span class="hljs-number">1</span>)[<span class="hljs-number">1</span>]
        <span class="hljs-keyword">try</span>:
            payload = jwt.decode(token, SECRET_KEY, algorithms=[ALGORITHM])
            <span class="hljs-keyword">return</span> <span class="hljs-literal">True</span>
        <span class="hljs-keyword">except</span> jwt.ExpiredSignatureError:
            <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">401</span>, detail=<span class="hljs-string">"Token expired"</span>)
        <span class="hljs-keyword">except</span> jwt.InvalidTokenError:
            <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">401</span>, detail=<span class="hljs-string">"Invalid token"</span>)

    <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">401</span>, detail=<span class="hljs-string">"Unauthorized"</span>)

<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">handle_sse</span>(<span class="hljs-params">request</span>):</span>
    check_auth(request=request)
    <span class="hljs-comment"># Prepare bidirectional streams over SSE</span>
    <span class="hljs-keyword">async</span> <span class="hljs-keyword">with</span> transport.connect_sse(
        request.scope,
        request.receive,
        request._send
    ) <span class="hljs-keyword">as</span> (in_stream, out_stream):
        <span class="hljs-comment"># Run the MCP server: read JSON-RPC from in_stream, write replies to out_stream</span>
        <span class="hljs-keyword">await</span> mcp._mcp_server.run(
            in_stream,
            out_stream,
            mcp._mcp_server.create_initialization_options()
        )


<span class="hljs-comment">#Build a small Starlette app for the two MCP endpoints</span>
sse_app = Starlette(
    routes=[
        Route(<span class="hljs-string">"/sse"</span>, handle_sse, methods=[<span class="hljs-string">"GET"</span>]),
        <span class="hljs-comment"># Note the trailing slash to avoid 307 redirects</span>
        Mount(<span class="hljs-string">"/messages/"</span>, app=transport.handle_post_message)
    ]
)
</code></pre>
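<p>Before wiring it into the server, the JWT round-trip that <code>check_auth</code> performs can be exercised standalone with PyJWT. This is a sketch with demo key and claims; the expired-token branch shows the error the server maps to a 401:</p>

```python
# Standalone sketch of the JWT round-trip check_auth performs (pip install PyJWT).
# Key and claims are demo values.
import datetime
import jwt

SECRET_KEY = "my_super_secret_key"
ALGORITHM = "HS256"

# Issue a token the way the /token endpoint does
payload = {
    "sub": "test_client",
    "exp": datetime.datetime.now(datetime.timezone.utc) + datetime.timedelta(minutes=60),
}
token = jwt.encode(payload, SECRET_KEY, algorithm=ALGORITHM)

# check_auth decodes it; a valid signature and an unexpired "exp" succeed
decoded = jwt.decode(token, SECRET_KEY, algorithms=[ALGORITHM])
print(decoded["sub"])  # test_client

# An already-expired token raises ExpiredSignatureError, mapped to HTTP 401
expired = jwt.encode(
    {
        "sub": "test_client",
        "exp": datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(seconds=1),
    },
    SECRET_KEY,
    algorithm=ALGORITHM,
)
try:
    jwt.decode(expired, SECRET_KEY, algorithms=[ALGORITHM])
except jwt.ExpiredSignatureError:
    print("expired token rejected")
```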
<p>Now let’s set up a FastAPI app and add a route for clients to create a token. We will create a mock client store to simulate a credential manager.</p>
<pre><code class="lang-python">app = FastAPI()

<span class="hljs-comment"># Mock client store</span>
CLIENTS = {
    <span class="hljs-string">"test_client"</span>: <span class="hljs-string">"secret_1234"</span>
}


<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">TokenRequest</span>(<span class="hljs-params">BaseModel</span>):</span>
    client_id: str
    client_secret: str


<span class="hljs-meta">@app.post("/token")</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">generate_token</span>(<span class="hljs-params">request: TokenRequest</span>):</span>
    <span class="hljs-keyword">if</span> request.client_id <span class="hljs-keyword">in</span> CLIENTS <span class="hljs-keyword">and</span> CLIENTS[request.client_id] == request.client_secret:
        payload = {
            <span class="hljs-string">"sub"</span>: request.client_id,
            <span class="hljs-string">"exp"</span>: datetime.datetime.now(datetime.timezone.utc) + datetime.timedelta(minutes=<span class="hljs-number">60</span>)
        }
        token = jwt.encode(payload, SECRET_KEY, algorithm=ALGORITHM)
        <span class="hljs-keyword">return</span> {<span class="hljs-string">"access_token"</span>: token}
    <span class="hljs-keyword">else</span>:
        <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">401</span>, detail=<span class="hljs-string">"Invalid credentials"</span>)
</code></pre>
<p>Mount the Starlette app onto FastAPI and use uvicorn to run the app.</p>
<pre><code class="lang-python">app.mount(<span class="hljs-string">"/"</span>, sse_app)

<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    <span class="hljs-keyword">import</span> uvicorn
    uvicorn.run(app, host=<span class="hljs-string">"0.0.0.0"</span>, port=<span class="hljs-number">8100</span>)
</code></pre>
<h2 id="heading-why-jwt-based-auth">🔒 Why JWT-Based Auth?</h2>
<p>Exchanging a <strong>client_id</strong> / <strong>client_secret</strong> pair for a short-lived JWT scales well:</p>
<ul>
<li><p>You can rotate credentials</p>
</li>
<li><p>Enforce expiration</p>
</li>
<li><p>Integrate external OAuth2/OIDC providers</p>
</li>
<li><p>Track session-level auth</p>
</li>
</ul>
<h2 id="heading-setting-up-the-mcp-sse-client-clientpy">Setting up the MCP SSE Client (client.py)</h2>
<p>Setting up the client is similar to what we have seen previously in this series. The only change is that we generate a token first and pass it in the headers of the SSE client.</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> asyncio
<span class="hljs-keyword">import</span> json
<span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Optional
<span class="hljs-keyword">from</span> mcp <span class="hljs-keyword">import</span> ClientSession
<span class="hljs-keyword">from</span> mcp.client.sse <span class="hljs-keyword">import</span> sse_client
<span class="hljs-keyword">from</span> openai <span class="hljs-keyword">import</span> OpenAI
<span class="hljs-keyword">import</span> mcp.client.sse <span class="hljs-keyword">as</span> _sse_mod
<span class="hljs-keyword">from</span> httpx <span class="hljs-keyword">import</span> AsyncClient <span class="hljs-keyword">as</span> _BaseAsyncClient
<span class="hljs-keyword">from</span> loguru <span class="hljs-keyword">import</span> logger
<span class="hljs-keyword">import</span> aiohttp

<span class="hljs-keyword">from</span> dotenv <span class="hljs-keyword">import</span> load_dotenv

load_dotenv()  <span class="hljs-comment"># load environment variables from .env</span>

<span class="hljs-keyword">import</span> httpx
_orig_request = httpx.AsyncClient.request

<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">_patched_request</span>(<span class="hljs-params">self, method, url, *args, **kwargs</span>):</span>
    <span class="hljs-comment"># ensure follow_redirects is set so 307 → /messages/ works</span>
    kwargs.setdefault(<span class="hljs-string">"follow_redirects"</span>, <span class="hljs-literal">True</span>)
    <span class="hljs-keyword">return</span> <span class="hljs-keyword">await</span> _orig_request(self, method, url, *args, **kwargs)

httpx.AsyncClient.request = _patched_request
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">llm_client</span>(<span class="hljs-params">message: str</span>):</span>
    client = OpenAI()

    completion = client.chat.completions.create(
        model=<span class="hljs-string">"gpt-4o-mini"</span>,
        messages=[
            {<span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"You are an intelligent Assistant. You will execute tasks as instructed"</span>},
            {
                <span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>,
                <span class="hljs-string">"content"</span>: message,
            },
        ],
    )

    result = completion.choices[<span class="hljs-number">0</span>].message.content
    <span class="hljs-keyword">return</span> result



<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_prompt_to_identify_tool_and_arguements</span>(<span class="hljs-params">query, tools</span>):</span>
    tools_description = <span class="hljs-string">"\n"</span>.join([<span class="hljs-string">f"<span class="hljs-subst">{tool.name}</span>: <span class="hljs-subst">{tool.description}</span>, <span class="hljs-subst">{tool.inputSchema}</span>"</span> <span class="hljs-keyword">for</span> tool <span class="hljs-keyword">in</span> tools.tools])
    <span class="hljs-keyword">return</span>  (<span class="hljs-string">"You are a helpful assistant with access to these tools:\n\n"</span>
                <span class="hljs-string">f"<span class="hljs-subst">{tools_description}</span>\n"</span>
                <span class="hljs-string">"Choose the appropriate tool based on the user's question. \n"</span>
                <span class="hljs-string">f"User's Question: <span class="hljs-subst">{query}</span>\n"</span>                
                <span class="hljs-string">"If no tool is needed, reply directly.\n\n"</span>
                <span class="hljs-string">"IMPORTANT: When you need to use a tool, you must ONLY respond with "</span>                
                <span class="hljs-string">"the exact JSON object format below, nothing else:\n"</span>
                <span class="hljs-string">"Keep the values in str "</span>
                <span class="hljs-string">"{\n"</span>
                <span class="hljs-string">'    "tool": "tool-name",\n'</span>
                <span class="hljs-string">'    "arguments": {\n'</span>
                <span class="hljs-string">'        "argument-name": "value"\n'</span>
                <span class="hljs-string">"    }\n"</span>
                <span class="hljs-string">"}\n\n"</span>)




TOKEN_URL = <span class="hljs-string">"http://localhost:8100/token"</span>
SSE_URL = <span class="hljs-string">"http://localhost:8100/sse"</span>

<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_token</span>():</span>
    payload = {<span class="hljs-string">"client_id"</span>: <span class="hljs-string">"test_client"</span>, <span class="hljs-string">"client_secret"</span>: <span class="hljs-string">"secret_1234"</span>}
    <span class="hljs-keyword">async</span> <span class="hljs-keyword">with</span> aiohttp.ClientSession() <span class="hljs-keyword">as</span> session:
        <span class="hljs-keyword">async</span> <span class="hljs-keyword">with</span> session.post(TOKEN_URL, json=payload) <span class="hljs-keyword">as</span> resp:
            <span class="hljs-keyword">if</span> resp.status != <span class="hljs-number">200</span>:
                logger.error(<span class="hljs-string">f"Failed to get token: <span class="hljs-subst">{resp.status}</span>"</span>)
                <span class="hljs-keyword">raise</span> Exception(<span class="hljs-string">"Unable to authenticate. Ensure you are using valid credentials"</span>)
            data = <span class="hljs-keyword">await</span> resp.json()
            logger.info(<span class="hljs-string">"Successfully generated token"</span>)
            <span class="hljs-keyword">return</span> data[<span class="hljs-string">"access_token"</span>]

<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">main</span>(<span class="hljs-params">query:str</span>):</span>        
    <span class="hljs-keyword">try</span>:
        auth_token = <span class="hljs-keyword">await</span> get_token()    
        headers = {<span class="hljs-string">"Authorization"</span>: <span class="hljs-string">f"Bearer <span class="hljs-subst">{auth_token}</span>"</span>}
        <span class="hljs-keyword">async</span> <span class="hljs-keyword">with</span> sse_client(url=SSE_URL,headers=headers) <span class="hljs-keyword">as</span> (in_stream, out_stream):
            <span class="hljs-comment"># 2) Create an MCP session over those streams</span>
            <span class="hljs-keyword">async</span> <span class="hljs-keyword">with</span> ClientSession(in_stream, out_stream) <span class="hljs-keyword">as</span> session:
                <span class="hljs-comment"># 3) Initialize</span>
                info = <span class="hljs-keyword">await</span> session.initialize()
                logger.info(<span class="hljs-string">f"Connected to <span class="hljs-subst">{info.serverInfo.name}</span> v<span class="hljs-subst">{info.serverInfo.version}</span>"</span>)

                <span class="hljs-comment"># 4) List tools</span>
                tools = (<span class="hljs-keyword">await</span> session.list_tools())
                logger.info(tools)            

                prompt = get_prompt_to_identify_tool_and_arguements(query,tools)
                logger.info(<span class="hljs-string">f"Printing Prompt \n <span class="hljs-subst">{prompt}</span>"</span>)

                response = llm_client(prompt)
                print(response)

                tool_call = json.loads(response)

                result = <span class="hljs-keyword">await</span> session.call_tool(tool_call[<span class="hljs-string">"tool"</span>], arguments=tool_call[<span class="hljs-string">"arguments"</span>])
                logger.success(<span class="hljs-string">f"User query: <span class="hljs-subst">{query}</span>, Tool Response: <span class="hljs-subst">{result.content[<span class="hljs-number">0</span>].text}</span>"</span>)
    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        print(<span class="hljs-string">f"Encountered error: <span class="hljs-subst">{e}</span>"</span>)



<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:

    queries = [<span class="hljs-string">"What is the time in Bengaluru?"</span>, <span class="hljs-string">"What is the weather like right now in Dubai?"</span>]
    <span class="hljs-keyword">for</span> query <span class="hljs-keyword">in</span> queries:
        asyncio.run(main(query))
</code></pre>
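<p>The prompt above forces the LLM to reply with a bare JSON object, which is why <code>json.loads(response)</code> works directly on the model output. As a sketch, with a hypothetical model reply for the weather tool:</p>

```python
# Sketch: parsing the JSON object the prompt instructs the LLM to emit.
# The reply text below is a hypothetical model response, not real API output.
import json

response = """{
    "tool": "weather_tool",
    "arguments": {
        "location": "Dubai"
    }
}"""

tool_call = json.loads(response)
print(tool_call["tool"])       # weather_tool
print(tool_call["arguments"])  # {'location': 'Dubai'}
```

<p>In practice you may want to wrap this parse in a try/except, since a model that ignores the format instruction will produce text that <code>json.loads</code> rejects.</p>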
<p>Here’s the output when the client is run:</p>
<pre><code>2025-05-18 18:00:48.230 | SUCCESS  | __main__:main:103 - User query: What is the weather like right now in Dubai?, Tool Response: The weather in Dubai is currently clear sky with a temperature of 30.37°C.
</code></pre>
<hr />
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://github.com/zahere-dev/mcp-labs">https://github.com/zahere-dev/mcp-labs</a></div>
<p> </p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=0KJ2oBRtUbs&amp;list=PLcV6wf9EnlujtRRE2cDeM8v69jAV2cRs9">https://www.youtube.com/watch?v=0KJ2oBRtUbs&amp;list=PLcV6wf9EnlujtRRE2cDeM8v69jAV2cRs9</a></div>
]]></content:encoded></item><item><title><![CDATA[Build an MCP Client and Server from Scratch Using Python]]></title><description><![CDATA[If you’re curious about how to build an intelligent agent using Model Context Protocol (MCP), you’re in the right place.
In this post, I’ll walk you through how to:

Create an MCP Server using FastMCP

Expose a tool that calculates BMI

Build a Clien...]]></description><link>https://zahere.com/build-an-mcp-client-and-server-from-scratch-using-python</link><guid isPermaLink="true">https://zahere.com/build-an-mcp-client-and-server-from-scratch-using-python</guid><category><![CDATA[mcp tutorial]]></category><category><![CDATA[mcp]]></category><category><![CDATA[MCP Client]]></category><category><![CDATA[mcp server]]></category><category><![CDATA[Model Context Protocol]]></category><dc:creator><![CDATA[Zahiruddin Tavargere]]></dc:creator><pubDate>Mon, 07 Apr 2025 04:19:54 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1743999386558/e8b06b92-389c-4926-a3eb-325039afaac2.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you’re curious about how to build an intelligent agent using <strong>Model Context Protocol (MCP)</strong>, you’re in the right place.</p>
<p>In this post, I’ll walk you through how to:</p>
<ul>
<li><p>Create an <strong>MCP Server</strong> using FastMCP</p>
</li>
<li><p>Expose a tool that calculates <strong>BMI</strong></p>
</li>
<li><p>Build a Client that communicates with this server via <strong>stdio</strong></p>
</li>
<li><p>Use <strong>OpenAI’s GPT model</strong> to decide which tool to call and how to call it</p>
</li>
</ul>
<p>Let’s break this down line by line — code and concept.</p>
<hr />
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=hMHYhRcd_Uo">https://www.youtube.com/watch?v=hMHYhRcd_Uo</a></div>
<p> </p>
<hr />
<h2 id="heading-what-is-mcp">What is MCP?</h2>
<p>Before diving into code, let’s understand what MCP is.</p>
<p>MCP (Model Context Protocol) is an <strong>open protocol</strong> developed by <a target="_blank" href="https://www.anthropic.com/">Anthropic</a> to <strong>standardize how LLMs interact with tools</strong>. Think of it as the <strong>USB-C of AI apps</strong> — a universal way to connect and interact with tools, APIs, and services without writing tons of custom glue code.</p>
<p>Here’s a simple analogy:</p>
<ul>
<li><p><strong>USB-C Port</strong>: One port to rule them all — display, power, storage.</p>
</li>
<li><p><strong>MCP</strong>: One protocol to access tools — calculators, search, databases, or any custom service.</p>
</li>
</ul>
<p>By adopting MCP, we avoid the pain of writing one-off integration code for every LLM interaction. Instead, we define tools once and let the LLM figure out how to use them.</p>
<p><a target="_blank" href="https://norahsakal.com/blog/mcp-vs-api-model-context-protocol-explained/"><img src="https://norahsakal.com/assets/images/mcp_overview-641a298352ff835488af36be3d8eee52.png" alt="What is MCP?" class="image--center mx-auto" /></a></p>
<p><em>Courtesy:</em> <a target="_blank" href="https://norahsakal.com/blog/mcp-vs-api-model-context-protocol-explained/"><em>https://norahsakal.com</em></a></p>
<hr />
<h2 id="heading-project-overview">🛠️ Project Overview</h2>
<p>We’ll build two things:</p>
<ol>
<li><p><strong>An MCP Server</strong> that exposes a simple tool to calculate BMI.</p>
</li>
<li><p><strong>An MCP Client</strong> that communicates with the server via an LLM (like OpenAI GPT) and invokes the tool.</p>
</li>
</ol>
<p>Let’s get started.</p>
<h2 id="heading-mcp-server">MCP Server</h2>
<p>There are just 2 dependencies we need to install</p>
<pre><code class="lang-bash">pip install "mcp[cli]"
pip install openai
</code></pre>
<p>Let’s create a file called <code>bmi_server.py</code>.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> mcp.server.fastmcp <span class="hljs-keyword">import</span> FastMCP

mcp = FastMCP(<span class="hljs-string">"BMI Server"</span>)

print(<span class="hljs-string">f"Starting server <span class="hljs-subst">{mcp.name}</span>"</span>)

<span class="hljs-meta">@mcp.tool()</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">calculate_bmi</span>(<span class="hljs-params">weight_kg:float, height_m:float</span>) -&gt; float:</span>
    <span class="hljs-string">"""
    Calculate BMI given weight in kg and height in meters.
    """</span>
    <span class="hljs-keyword">if</span> height_m &lt;= <span class="hljs-number">0</span>:
        <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">"Height must be greater than zero."</span>)
    <span class="hljs-keyword">return</span> weight_kg / (height_m ** <span class="hljs-number">2</span>)


<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    mcp.run(transport=<span class="hljs-string">"stdio"</span>)
</code></pre>
<p><strong>What’s happening here?</strong></p>
<ul>
<li><p>We use the FastMCP class to create the MCP server</p>
</li>
<li><p>We create a BMI tool using the <code>@mcp.tool()</code>decorator.</p>
</li>
<li><p>It takes weight and height and returns BMI.</p>
</li>
<li><p>The server exposes this tool using standard input/output transport.</p>
</li>
<li><p>We want this file to be run independently and not as a module</p>
</li>
</ul>
<p>Obviously, this is a very basic MCP server, but it should get us started on building our own MCP client.</p>
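<p>Stripped of the MCP wrapper, the tool is just the BMI formula: weight in kilograms divided by height in meters squared. A quick standalone check of that logic:</p>

```python
# The same BMI logic the @mcp.tool() wraps, callable directly for a sanity check.
def calculate_bmi(weight_kg: float, height_m: float) -> float:
    """Calculate BMI given weight in kg and height in meters."""
    if height_m <= 0:
        raise ValueError("Height must be greater than zero.")
    return weight_kg / (height_m ** 2)

print(round(calculate_bmi(70, 1.75), 2))  # 22.86
```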
<h2 id="heading-mcp-client">MCP Client</h2>
<p>Now, let’s create the client in a file named <code>bmi_client.py</code>.</p>
<p>Import all the dependencies. The key ones here being ClientSession, StdioServerParameters and stdio_client from the mcp package and OpenAI class to communicate with the LLM.</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> asyncio
<span class="hljs-keyword">from</span> openai <span class="hljs-keyword">import</span> OpenAI
<span class="hljs-keyword">from</span> mcp <span class="hljs-keyword">import</span> ClientSession, StdioServerParameters
<span class="hljs-keyword">from</span> mcp.client.stdio <span class="hljs-keyword">import</span> stdio_client
<span class="hljs-keyword">import</span> os
<span class="hljs-keyword">import</span> json
</code></pre>
<p>We need to now establish a way to communicate with the server we just wrote. Let’s use the StdioServerParameters class from the mcp package to do that.</p>
<pre><code class="lang-python">server_params = StdioServerParameters(command=<span class="hljs-string">"python"</span>, args=[<span class="hljs-string">"bmi_server.py"</span>])
</code></pre>
<p>Essentially, we are telling the client what command to run and how to use the server file.</p>
<p>Let’s write a generic helper to communicate with the LLM API. It simply takes a prompt and returns a response.</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">llm_client</span>(<span class="hljs-params">message:str</span>):</span>
    <span class="hljs-string">"""
    Send a message to the LLM and return the response.
    """</span>
    <span class="hljs-comment"># Initialize the OpenAI client</span>
    openai_client = OpenAI(api_key=os.getenv(<span class="hljs-string">"OPENAI_API_KEY"</span>))

    <span class="hljs-comment"># Send the message to the LLM</span>
    response = openai_client.chat.completions.create(
        model=<span class="hljs-string">"gpt-4o-mini"</span>,
        messages=[
            {<span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"You are an intelligent assistant. You will execute tasks as prompted"</span>},
            {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: message},
        ],
        max_tokens=<span class="hljs-number">250</span>,
        temperature=<span class="hljs-number">0.2</span>
    )

    <span class="hljs-comment"># Extract and return the response content</span>
    <span class="hljs-keyword">return</span> response.choices[<span class="hljs-number">0</span>].message.content.strip()
</code></pre>
<p>We need to write a prompt that does two things:</p>
<p>1. Share the context with the LLM about the tools it has at its disposal<br />2. Request structured output so that we can execute the tool the LLM selects</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_prompt_to_identify_tool_and_arguments</span>(<span class="hljs-params">query,tools</span>):</span>
    tools_description = <span class="hljs-string">"\n"</span>.join([<span class="hljs-string">f"- <span class="hljs-subst">{tool.name}</span>, <span class="hljs-subst">{tool.description}</span>, <span class="hljs-subst">{tool.inputSchema}</span> "</span> <span class="hljs-keyword">for</span> tool <span class="hljs-keyword">in</span> tools])
    <span class="hljs-keyword">return</span>  (<span class="hljs-string">"You are a helpful assistant with access to these tools:\n\n"</span>
                <span class="hljs-string">f"<span class="hljs-subst">{tools_description}</span>\n"</span>
                <span class="hljs-string">"Choose the appropriate tool based on the user's question. \n"</span>
                <span class="hljs-string">f"User's Question: <span class="hljs-subst">{query}</span>\n"</span>                
                <span class="hljs-string">"If no tool is needed, reply directly.\n\n"</span>
                <span class="hljs-string">"IMPORTANT: When you need to use a tool, you must ONLY respond with "</span>                
                <span class="hljs-string">"the exact JSON object format below, nothing else:\n"</span>
                <span class="hljs-string">"Keep the values in str "</span>
                <span class="hljs-string">"{\n"</span>
                <span class="hljs-string">'    "tool": "tool-name",\n'</span>
                <span class="hljs-string">'    "arguments": {\n'</span>
                <span class="hljs-string">'        "argument-name": "value"\n'</span>
                <span class="hljs-string">"    }\n"</span>
                <span class="hljs-string">"}\n\n"</span>)
</code></pre>
<p>We are passing two arguments: the original query from the user and the list of tools from the server(s). Let’s now see how to get the list of tools from the server.</p>
<pre><code class="lang-python">
<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">run</span>(<span class="hljs-params">query: str</span>):</span>
    <span class="hljs-keyword">async</span> <span class="hljs-keyword">with</span> stdio_client(server_params) <span class="hljs-keyword">as</span> (read, write):
        <span class="hljs-keyword">async</span> <span class="hljs-keyword">with</span> ClientSession(read,write) <span class="hljs-keyword">as</span> session:

            <span class="hljs-keyword">await</span> session.initialize()

            <span class="hljs-comment"># Get the list of available tools</span>
            tools = <span class="hljs-keyword">await</span> session.list_tools()

            print(<span class="hljs-string">f"Available tools: <span class="hljs-subst">{tools}</span>"</span>)

            prompt = get_prompt_to_identify_tool_and_arguments(query,tools.tools)

            llm_response = llm_client(prompt)
            print(<span class="hljs-string">f"LLM Response: <span class="hljs-subst">{llm_response}</span>"</span>)

            tool_call = json.loads(llm_response)

            result = <span class="hljs-keyword">await</span> session.call_tool(tool_call[<span class="hljs-string">"tool"</span>], arguments=tool_call[<span class="hljs-string">"arguments"</span>])

            <span class="hljs-comment"># Use single quotes inside the f-string: reusing the same quote type is a syntax error before Python 3.12</span>
            <span class="hljs-keyword">return</span> (<span class="hljs-string">f"BMI for weight <span class="hljs-subst">{tool_call[<span class="hljs-string">'arguments'</span>][<span class="hljs-string">'weight_kg'</span>]}</span>kg "</span>
                    <span class="hljs-string">f"and height <span class="hljs-subst">{tool_call[<span class="hljs-string">'arguments'</span>][<span class="hljs-string">'height_m'</span>]}</span>m is <span class="hljs-subst">{result.content[<span class="hljs-number">0</span>].text}</span>"</span>)
</code></pre>
<p><strong>Key things happening here:</strong></p>
<ul>
<li><p>We start the BMI Server process using <code>stdio_client</code>, and get access to its read and write streams</p>
</li>
<li><p>Next, we create a session and initialize it. This sets up the communication between our client and the server.</p>
</li>
<li><p>We then ask the server for a list of all available tools. In our case, it will return the <code>calculate_bmi</code> function we exposed earlier.</p>
</li>
<li><p>We prepare a detailed prompt using the earlier function, passing in the user's query and available tools. This prompt helps the language model figure out what tool to call and what arguments to pass.</p>
</li>
<li><p>We parse the JSON response, extract the tool name and arguments, and then call the tool using <code>session.call_tool</code>.</p>
</li>
<li><p>Finally, we return the output from the tool call — in this case, the calculated BMI.</p>
</li>
</ul>
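<p>One fragility worth flagging in the flow above: <code>json.loads</code> will fail if the model wraps its JSON reply in Markdown code fences, which happens often even with strict prompt instructions. A small defensive parser helps (a hypothetical helper, not part of the repo below):</p>
<pre><code class="lang-python">import json
import re

def parse_tool_call(llm_response: str) -> dict:
    """Parse the model's tool-selection JSON, tolerating Markdown code fences."""
    text = llm_response.strip()
    # Strip a leading ```json / ``` fence and a trailing ``` fence, if present
    text = re.sub(r"^```(?:json)?\s*", "", text)
    text = re.sub(r"\s*```$", "", text)
    return json.loads(text)
</code></pre>
<p>Calling <code>parse_tool_call(llm_response)</code> in place of <code>json.loads(llm_response)</code> keeps the client working whether or not the model obeys the "nothing else" instruction.</p>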
<p>To make all of the above work, we add an entry point. When this script is run, it sets the user query (here, a BMI calculation) and runs the <code>run()</code> function with <code>asyncio</code>. The final result is printed on screen.</p>
<pre><code class="lang-python"><span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    <span class="hljs-keyword">import</span> asyncio
    query = <span class="hljs-string">"Calculate BMI for height 5ft 10inches and weight 80kg"</span>
    print(<span class="hljs-string">f"Sending query: <span class="hljs-subst">{query}</span>"</span>)
    result = asyncio.run(run(query))
    print(<span class="hljs-string">f"Result: <span class="hljs-subst">{result}</span>"</span>)
</code></pre>
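<p>Note that the sample query gives height in feet and inches while <code>calculate_bmi</code> expects metres, so the LLM is also expected to convert units when it fills in the arguments. Assuming the standard BMI formula (weight in kilograms divided by height in metres squared), the arithmetic it should arrive at is:</p>
<pre><code class="lang-python"># What a correct tool call for the sample query works out to
height_m = (5 * 12 + 10) * 0.0254   # 5 ft 10 in = 70 in = 1.778 m
weight_kg = 80.0

bmi = weight_kg / height_m ** 2
print(round(bmi, 1))  # 25.3
</code></pre>
<p>If the model passed the height in feet instead, the result would be nonsense, which is why surfacing each tool's <code>inputSchema</code> in the prompt matters: it tells the model the tool wants <code>height_m</code>.</p>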
<hr />
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://github.com/zahere-dev/mcp-labs">https://github.com/zahere-dev/mcp-labs</a></div>
<p> </p>
<hr />
<p>And that’s how you build an intelligent MCP client that can call tools dynamically using OpenAI and a BMI server. The magic lies in combining tool discovery, LLM prompting, and tool invocation — all within a simple and elegant flow.</p>
]]></content:encoded></item><item><title><![CDATA[My Favorite OpenAI Agents SDK Feature (And The Most Understated!)]]></title><description><![CDATA[In our previous tutorial, we built a restaurant customer support chatbot using OpenAI's Agents SDK. In this follow-up, we’ll explore guardrails—a critical feature that enhances AI chatbot safety and reliability.
What Are Guardrails in AI Agents?
Guar...]]></description><link>https://zahere.com/my-favorite-openai-agents-sdk-feature-and-the-most-understated</link><guid isPermaLink="true">https://zahere.com/my-favorite-openai-agents-sdk-feature-and-the-most-understated</guid><category><![CDATA[agentic AI]]></category><category><![CDATA[agents]]></category><category><![CDATA[agentic workflow]]></category><category><![CDATA[openai]]></category><dc:creator><![CDATA[Zahiruddin Tavargere]]></dc:creator><pubDate>Mon, 24 Mar 2025 02:04:41 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1742781696025/6f9fc9fc-94a4-462d-b54a-896fbadb34ba.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In our <a target="_blank" href="https://newsletter.adaptiveengineer.com/p/building-a-multi-agent-system-with">previous tutorial, we built a restaurant customer support chatbot</a> using OpenAI's Agents SDK. In this follow-up, we’ll explore <strong>guardrails</strong>—a critical feature that enhances AI chatbot safety and reliability.</p>
<h3 id="heading-what-are-guardrails-in-ai-agents">What Are Guardrails in AI Agents?</h3>
<p>Guardrails act as a <strong>safety net</strong> for AI agents, ensuring they operate within predefined boundaries and preventing misuse.</p>
<p>They work alongside agents, validating user inputs and outputs to safeguard against errors and inappropriate responses.</p>
<p>There are two types of guardrails:</p>
<ul>
<li><p><strong>Input Guardrails</strong>: Validate user inputs before processing.</p>
</li>
<li><p><strong>Output Guardrails</strong>: Ensure the final response is appropriate before delivering it to the user.</p>
</li>
</ul>
<p><img src="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8fe4d90-98ff-4bab-845c-3606f61490e9_2665x1120.png" alt /></p>
<p>Let’s see them in action!</p>
<h2 id="heading-input-guardrails">Input Guardrails</h2>
<ul>
<li><p>These validate initial user inputs before passing them to expensive models.</p>
</li>
<li><p>They operate in three steps: receiving input, running validation functions, and triggering errors if misuse is detected.</p>
</li>
</ul>
<p>Input guardrails are mechanisms put in place to validate, sanitize, and preprocess user inputs before they reach an AI model. These safeguards help in preventing:</p>
<ul>
<li><p>Malicious injections (e.g., prompt injection attacks)</p>
</li>
<li><p>Profanity, hate speech, and harmful language</p>
</li>
<li><p>Unstructured or irrelevant input that reduces model efficiency</p>
</li>
<li><p>Bias amplification</p>
</li>
</ul>
<p>By implementing input guardrails, developers can ensure that AI models receive well-structured and appropriate input, leading to better and safer outputs.</p>
<h3 id="heading-why-are-input-guardrails-important"><strong>Why Are Input Guardrails Important?</strong></h3>
<ol>
<li><p><strong>Security</strong>: Prevents prompt injections, SQL injections, and adversarial attacks.</p>
</li>
<li><p><strong>Quality Assurance</strong>: Filters out irrelevant or poorly structured queries.</p>
</li>
<li><p><strong>Bias Mitigation</strong>: Helps remove explicit bias in prompts.</p>
</li>
<li><p><strong>User Experience</strong>: Ensures clear and understandable input for meaningful responses.</p>
</li>
<li><p><strong>Compliance</strong>: Adheres to ethical AI principles and regulatory requirements.</p>
</li>
</ol>
<h3 id="heading-how-to-implement-input-guardrails"><strong>How to Implement Input Guardrails</strong></h3>
<p>Create the guardrail agent as shown below, using the <code>@input_guardrail</code> decorator on the guardrail method.</p>
<p>More here in the <a target="_blank" href="https://openai.github.io/openai-agents-python/guardrails/">documentation</a>.</p>
<p>Input guardrails run in 3 steps:</p>
<ol>
<li><p>First, the guardrail receives the same input passed to the agent.</p>
</li>
<li><p>Next, the guardrail function runs to produce a <code>GuardrailFunctionOutput</code>, which is then wrapped in an <code>InputGuardrailResult</code></p>
</li>
<li><p>Finally, we check if <code>.tripwire_triggered</code> is true. If true, an <code>InputGuardrailTripwireTriggered</code> exception is raised, so you can appropriately respond to the user or handle the exception.</p>
</li>
</ol>
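<p>Stripped of the SDK, the tripwire pattern itself is tiny. The sketch below mirrors the SDK's names (<code>GuardrailFunctionOutput</code>, the tripwire exception) but is plain illustrative Python, not the Agents SDK API; the real check would typically be another, cheaper agent rather than a keyword list:</p>
<pre><code class="lang-python">from dataclasses import dataclass

@dataclass
class GuardrailFunctionOutput:
    """Mirrors the SDK's result object: some info plus a boolean tripwire."""
    output_info: str
    tripwire_triggered: bool

class InputGuardrailTripwireTriggered(Exception):
    pass

BLOCKED = ("pathetic", "useless")  # toy abuse list, purely for illustration

def abuse_guardrail(user_input: str) -> GuardrailFunctionOutput:
    hit = any(word in user_input.lower() for word in BLOCKED)
    return GuardrailFunctionOutput(output_info="abuse check", tripwire_triggered=hit)

def run_with_guardrail(user_input: str) -> str:
    result = abuse_guardrail(user_input)   # steps 1-2: same input, run the check
    if result.tripwire_triggered:          # step 3: raise on tripwire
        raise InputGuardrailTripwireTriggered(result.output_info)
    return f"Agent handles: {user_input}"
</code></pre>
<p>The SDK version does the same thing, with the wiring handled by the <code>@input_guardrail</code> decorator and by passing the guardrail to the agent, as shown in the screenshots below.</p>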
<hr />
<p><img src="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43849546-663f-4fd3-9c96-a6b5ee456070_1687x1115.png" alt /></p>
<p>Pass the guardrail as an argument to the triage agent.</p>
<p><img src="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F065268de-2a29-4489-98cf-6622ce6f4821_1222x307.png" alt /></p>
<p>Handle the <code>InputGuardrailTripwireTriggered</code> exception.</p>
<p><img src="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd304c0ec-edd5-4bf6-b425-c66c83cecf1f_1897x1122.png" alt /></p>
<p>Exception raised when the input is “Why is my order delayed? You guys are pathetic”.</p>
<p><img src="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f088748-229c-4d11-ab0c-14cc4fc6b076_3140x150.png" alt /></p>
<h2 id="heading-output-guardrails">Output Guardrails</h2>
<ul>
<li><p>These validate the final outputs generated by agents before they are delivered to users.</p>
</li>
<li><p>They operate similarly to input guardrails but focus on the output stage to ensure accuracy and safety.</p>
</li>
</ul>
<p>Output guardrails run in 3 steps:</p>
<ol>
<li><p>First, the guardrail receives the output produced by the agent.</p>
</li>
<li><p>Next, the guardrail function runs to produce a <code>GuardrailFunctionOutput</code>, which is then wrapped in an <code>OutputGuardrailResult</code></p>
</li>
<li><p>Finally, we check if <code>.tripwire_triggered</code> is true. If true, an <code>OutputGuardrailTripwireTriggered</code> exception is raised, so you can appropriately respond to the user or handle the exception.</p>
</li>
</ol>
<p>In the example below, we check for the word “card” in the response of the <code>order_agent</code> and raise an exception accordingly.</p>
<p><img src="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb97ddf2a-3011-4d0b-9ded-8c254ee07e0f_1575x855.png" alt /></p>
<p>Add the output guardrail as the argument to the final agent in the workflow.</p>
<p><img src="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4910aa32-8630-4118-a845-1e03d332b916_2905x325.png" alt /></p>
<p>We simulated the response of the <code>order_agent</code> for order 12346 so that it contains the word “card”; this is how the exception is caught.</p>
<p><img src="https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff197c0bb-2593-4893-b0bc-02c4dfc5fb75_1780x80.png" alt /></p>
<h2 id="heading-code">Code</h2>
<p><a target="_blank" href="https://github.com/zahere-dev/openai-agents-sdk-tutorial">https://github.com/zahere-dev/openai-agents-sdk-tutorial</a></p>
<h2 id="heading-conclusion"><strong>Conclusion:</strong></h2>
<ul>
<li><p>Guardrails are vital components of AI agent systems, ensuring they operate safely and efficiently.</p>
</li>
<li><p>By implementing guardrails, developers can enhance user trust and prevent misuse scenarios effectively.</p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[How Uber Saved 140,000 Hours Monthly Using Generative AI Agents]]></title><description><![CDATA[Video
https://www.youtube.com/watch?v=UPBMkFSJdBI
 

The Problem at Hand

Uber's data platform processes approximately 1.2 million interactive queries monthly, with 36% of these coming from the operations organization. This group—comprising engineers...]]></description><link>https://zahere.com/how-uber-saved-140000-hours-monthly-using-generative-ai-agents</link><guid isPermaLink="true">https://zahere.com/how-uber-saved-140000-hours-monthly-using-generative-ai-agents</guid><category><![CDATA[txt to sql]]></category><category><![CDATA[generative ai]]></category><category><![CDATA[#agent]]></category><category><![CDATA[uber]]></category><category><![CDATA[agents]]></category><dc:creator><![CDATA[Zahiruddin Tavargere]]></dc:creator><pubDate>Tue, 14 Jan 2025 07:16:05 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1736838883241/13fbf6b9-21bc-460d-8359-639a7c278d78.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-video">Video</h2>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=UPBMkFSJdBI">https://www.youtube.com/watch?v=UPBMkFSJdBI</a></div>
<p> </p>
<hr />
<h2 id="heading-the-problem-at-hand"><strong>The Problem at Hand</strong></h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1736837914622/fa80d083-47e5-4411-920f-601a0ef22bae.png" alt class="image--center mx-auto" /></p>
<p>Uber's data platform processes approximately <strong>1.2 million interactive queries monthly</strong>, with 36% of these coming from the operations organization. This group—comprising engineers, data scientists, and operations professionals—analyzes data from hundreds of thousands of tables across various domains to derive actionable insights.</p>
<p>However, the process of composing and executing queries was a bottleneck:</p>
<ul>
<li><p><strong>10 minutes per query</strong>: Each user spent an average of 10 minutes composing a query.</p>
</li>
<li><p><strong>Inefficiency Loop</strong>: Users would sift through datasets, run queries, and validate results in a repetitive cycle.</p>
</li>
<li><p><strong>Wasted Time</strong>: The cumulative effect of this inefficiency led to significant lost productivity.</p>
</li>
</ul>
<p>This challenge is not unique to Uber. It resonates across industries, from e-commerce to customer support, where operations teams grapple with similar inefficiencies.</p>
<hr />
<h2 id="heading-enter-querygpt-the-hackathon-solution"><strong>Enter QueryGPT: The Hackathon Solution</strong></h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1736837946748/89c8877f-203e-431d-a8c3-7c47b4dc0d14.png" alt class="image--center mx-auto" /></p>
<p>In 2023, a team at Uber's hackathon introduced <strong>QueryGPT</strong>, a prototype designed to streamline the query-generation process. Here's how it worked:</p>
<ol>
<li><p><strong>Metadata-Driven Query Generation</strong>:</p>
<ul>
<li><p>Stored 20 SQL queries with metadata (table and schema information).</p>
</li>
<li><p>Mapped natural language prompts to these queries.</p>
</li>
</ul>
</li>
<li><p><strong>Few-Shot Prompting</strong>:</p>
<ul>
<li><p>Used a Retrieval-Augmented Generation (RAG) technique to fetch relevant queries.</p>
</li>
<li><p>Generated SQL queries in response to user prompts.</p>
</li>
</ul>
</li>
<li><p><strong>Initial Results</strong>:</p>
<ul>
<li><p>Reduced query composition time from 10 minutes to 3 minutes.</p>
</li>
<li><p>Achieved an <strong>18% productivity gain</strong>.</p>
</li>
</ul>
</li>
</ol>
<p>While this was a promising start, the prototype faced scalability and technical challenges, necessitating further iterations.</p>
<hr />
<h2 id="heading-challenges-and-iterative-solutions"><strong>Challenges and Iterative Solutions</strong></h2>
<h3 id="heading-key-challenges"><strong>Key Challenges</strong></h3>
<ol>
<li><p><strong>Prompt-to-Schema Mismatch</strong>:</p>
<ul>
<li>The system struggled to align user prompts with relevant schemas.</li>
</ul>
</li>
<li><p><strong>Token Limitations</strong>:</p>
<ul>
<li>Some schemas had over 200 columns, leading to token counts exceeding GPT-4's 32k limit.</li>
</ul>
</li>
</ol>
<hr />
<h2 id="heading-the-final-architecture"><strong>The Final Architecture</strong></h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1736838008565/1d10b83a-a170-48ac-b1bf-db9ce48e8fcd.png" alt class="image--center mx-auto" /></p>
<p>The refined system, powered by <strong>Azure OpenAI and GPT-4</strong>, demonstrated remarkable efficiency:</p>
<ul>
<li><p><strong>Context Optimization</strong>: Leveraged a context window of 128k tokens to handle large schemas.</p>
</li>
<li><p><strong>Human Validation</strong>: Ensured precision through user acknowledgment of suggested tables.</p>
</li>
<li><p><strong>Scalable Design</strong>: Addressed the challenge of querying across hundreds of thousands of datasets.</p>
</li>
</ul>
<p>Uber's engineering team implemented a robust architecture combining <strong>SQL, RAG, agents, and custom configurations</strong>. Here's a breakdown:</p>
<ol>
<li><p><strong>Domain-Specific Curation</strong>:</p>
<ul>
<li><p>Decomposed datasets into <strong>business domains/workflows</strong> (e.g., mobility, trips, support).</p>
</li>
<li><p>Allowed the system to focus on smaller, relevant subsets of data.</p>
</li>
</ul>
</li>
<li><p><strong>Intent Agent</strong>:</p>
<ul>
<li><p>Classified user prompts to map them to the appropriate domain or workspace.</p>
</li>
<li><p>Likely employed a vector-store-based intent classifier for high accuracy.</p>
</li>
</ul>
</li>
<li><p><strong>Table Agent</strong>:</p>
<ul>
<li><p>Retrieved domain-specific tables and displayed them in a user-friendly interface.</p>
</li>
<li><p>Enabled human-in-the-loop validation to ensure table relevance.</p>
</li>
</ul>
</li>
<li><p><strong>Enhanced RAG Pipeline</strong>:</p>
<ul>
<li><p>Generated few-shot prompts tailored to the specific domain.</p>
</li>
<li><p>Sent refined prompts to GPT-4 for SQL query generation.</p>
</li>
</ul>
</li>
</ol>
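<p>The intent-routing idea is easy to picture with a toy sketch: embed each domain description and the incoming prompt, then route to the most similar domain. (Purely illustrative; word-count vectors stand in for real embeddings, and Uber's actual classifier isn't public.)</p>
<pre><code class="lang-python">import math
from collections import Counter

def embed(text: str) -> Counter:
    """Stand-in for a real embedding model: bag-of-words counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

DOMAINS = {
    "mobility": "driver trips rides vehicles city mobility",
    "support": "customer support tickets refunds complaints",
}

def route_intent(prompt: str) -> str:
    """Pick the business domain whose description best matches the prompt."""
    return max(DOMAINS, key=lambda d: cosine(embed(prompt), embed(DOMAINS[d])))
</code></pre>
<p>With a real vector store in place of the word counts, the same lookup narrows hundreds of thousands of tables down to one domain's worth before any schema ever reaches the model's context window.</p>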
<h3 id="heading-real-world-impact"><strong>Real-World Impact</strong></h3>
<p>By the 20th iteration, Uber's Query GPT achieved a staggering <strong>140,000 hours saved monthly</strong> across its operations organization. This success underscores the value of combining AI, domain-specific curation, and user-centric design.</p>
<hr />
<h2 id="heading-key-takeaways-for-your-business"><strong>Key Takeaways for Your Business</strong></h2>
<p>Uber's solution offers valuable insights for tackling similar challenges in other industries:</p>
<ol>
<li><p><strong>Break Down Data Silos</strong>:</p>
<ul>
<li>Organize datasets by business domains to streamline data retrieval.</li>
</ul>
</li>
<li><p><strong>Implement Intent Detection</strong>:</p>
<ul>
<li>Use AI-driven agents to map user queries to relevant datasets or workspaces.</li>
</ul>
</li>
<li><p><strong>Leverage Human-in-the-Loop Systems</strong>:</p>
<ul>
<li>Involve users in validating AI-generated outputs for enhanced accuracy.</li>
</ul>
</li>
<li><p><strong>Iterate for Scalability</strong>:</p>
<ul>
<li>Start small, learn from challenges, and scale iteratively.</li>
</ul>
</li>
</ol>
<hr />
<h2 id="heading-the-future-of-ai-in-operations"><strong>The Future of AI in Operations</strong></h2>
<p>Uber's journey with QueryGPT exemplifies the transformative potential of generative AI in operational analytics. By reducing manual effort and empowering teams with intelligent tools, businesses can unlock unprecedented productivity gains.</p>
<p>Whether you're in e-commerce, customer support, or any data-intensive field, the principles behind Uber's success can guide your own AI-driven innovations.</p>
<p>Want to delve deeper into the technical details? Check out Uber's engineering blog <a target="_blank" href="https://www.uber.com/en-TW/blog/query-gpt/">here</a> for the full story.</p>
]]></content:encoded></item><item><title><![CDATA[A Deep Dive into Google's "Agents" White Paper: Hype or Revolution?]]></title><description><![CDATA[Video
https://www.youtube.com/watch?v=FgRGwnpd2HY
 
Google's recent white paper on "Agents" has created quite a buzz.
The paper explores the concept of AI agents and delves into their architecture and potential. Let's break down what this white paper...]]></description><link>https://zahere.com/a-deep-dive-into-googles-agents-white-paper-hype-or-revolution</link><guid isPermaLink="true">https://zahere.com/a-deep-dive-into-googles-agents-white-paper-hype-or-revolution</guid><category><![CDATA[agentic AI]]></category><category><![CDATA[#agent]]></category><category><![CDATA[agents]]></category><dc:creator><![CDATA[Zahiruddin Tavargere]]></dc:creator><pubDate>Fri, 10 Jan 2025 07:39:43 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1736494697057/2b963ba4-c065-4f0a-9027-43f66dc9eb96.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-video">Video</h2>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=FgRGwnpd2HY">https://www.youtube.com/watch?v=FgRGwnpd2HY</a></div>
<p> </p>
<p>Google's recent white paper on "Agents" has created quite a buzz.</p>
<p>The paper explores the concept of AI agents and delves into their architecture and potential. Let's break down what this white paper offers, its key takeaways, and some areas where it could improve.</p>
<h2 id="heading-the-marketing-angle-a-platform-centric-view">The Marketing Angle: A Platform-Centric View?</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1736494044585/727245a6-4b01-41b8-9590-5debbe7fcccd.png" alt class="image--center mx-auto" /></p>
<p>At first glance, the white paper feels like a marketing tool for Google's <strong>Vertex AI</strong>. And that's perfectly fine—after all, companies often use such publications to showcase their platforms.</p>
<p>However, adopting a more <strong>platform-agnostic</strong> approach could have made the paper more universally applicable.</p>
<p>For instance, many examples in the white paper are tied to <strong>Vertex AI-specific features</strong>, which might be unfamiliar to those using other agentic frameworks.</p>
<p>Additionally, certain concepts, like extensions, are introduced but not elaborated on in sufficient detail, leaving room for better documentation and clarity.</p>
<p>Despite these limitations, the paper provides a solid starting point for understanding agents. Let’s dive into the key concepts.</p>
<hr />
<h2 id="heading-what-is-an-agent">What is an Agent?</h2>
<p>Google defines an agent as:</p>
<blockquote>
<p>"An application that attempts to achieve a goal by observing the world and acting upon it using the tools at its disposal."</p>
</blockquote>
<p>This definition is both simple and powerful. It captures the essence of what an agent is without overcomplicating things. While many experts on platforms like LinkedIn and YouTube often layer terms like reasoning, context awareness, and more onto the definition, the core idea remains straightforward.</p>
<p>Interestingly, my personal favorite definition comes from Hugging Face, which describes AI agents as:</p>
<blockquote>
<p>"Programs where LLM outputs control the workflows."</p>
</blockquote>
<p>This succinctly highlights the operational dynamics of agents, especially when integrated with language models.</p>
<hr />
<h2 id="heading-the-agentic-architecture-core-components">The Agentic Architecture: Core Components</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1736494086693/f77bedcc-e7ed-4248-bd9a-76caca2c58a1.png" alt class="image--center mx-auto" /></p>
<p>The white paper also details the architecture of agents, a topic I’ve previously discussed on my channel. Here's a simplified breakdown of the <strong>three primary components</strong> that define an agentic system:</p>
<h3 id="heading-1-the-model">1. <strong>The Model</strong></h3>
<p>At the heart of any agent lies a <strong>language model</strong>. This serves as the foundation for the agent's intelligence and capabilities. Trained on extensive datasets, the model enables the agent to comprehend language, process instructions, and provide knowledge.</p>
<p>In an agentic framework, the model is not just a passive responder. Its capabilities drive the decision-making processes within the orchestration layer.</p>
<h3 id="heading-2-the-tools">2. <strong>The Tools</strong></h3>
<p>Tools are what set agents apart from simple LLM calls. Since LLMs are inherently limited—they can’t interact with external systems or access real-time information—tools extend their capabilities.</p>
<p>Agents use tools to interact with the external world, making them more dynamic and useful. Frameworks like <strong>LangChain</strong> and <strong>LlamaIndex</strong> exemplify how tools can augment the performance of agentic systems, enabling them to achieve their goals effectively.</p>
<h3 id="heading-3-the-orchestration-layer">3. <strong>The Orchestration Layer</strong></h3>
<p>Often referred to as the <strong>reasoning loop</strong>, this layer governs the agent's ability to:</p>
<ul>
<li><p><strong>Plan</strong>: Decide the next steps in a workflow.</p>
</li>
<li><p><strong>Reason</strong>: Analyze the gathered information.</p>
</li>
<li><p><strong>Execute</strong>: Take action based on the plan.</p>
</li>
</ul>
<p>This iterative process is the backbone of an agent’s functionality, ensuring it can adapt and respond intelligently to various scenarios.</p>
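<p>Framework aside, the reasoning loop itself is compact. A minimal sketch (stubbed model and tools chosen for illustration, not Vertex AI code):</p>
<pre><code class="lang-python">def orchestrate(goal: str, model, tools: dict, max_steps: int = 5):
    """Plan / reason -> execute -> observe, until the model decides it is done."""
    observations = []
    for _ in range(max_steps):
        action = model(goal, observations)        # plan + reason over what we know
        if action["tool"] == "finish":
            return action["answer"]
        result = tools[action["tool"]](**action["arguments"])  # execute
        observations.append(result)               # feed the observation back in
    return None

# Stub model: call the weather tool once, then finish with what it observed
def stub_model(goal, observations):
    if not observations:
        return {"tool": "weather", "arguments": {"city": "Pune"}}
    return {"tool": "finish", "answer": observations[-1]}

tools = {"weather": lambda city: f"Sunny in {city}"}
print(orchestrate("weather in Pune?", stub_model, tools))  # Sunny in Pune
</code></pre>
<p>A real agent replaces <code>stub_model</code> with an LLM prompted to emit the next action, but the loop shape stays the same.</p>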
<hr />
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1736494138662/bce10d82-cc83-416a-9a53-9f0e57fc1b2f.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-tools-key-takeaways"><strong>Tools (Key Takeaways)</strong></h2>
<ol>
<li><p><strong>Extensions</strong></p>
<ul>
<li><p><strong>Definition</strong>: Extensions are interfaces that bridge the gap between APIs and agents. They allow for seamless API execution by teaching agents how to use APIs via examples.</p>
</li>
<li><p><strong>Use Case</strong>: Extensions are ideal for scenarios where the agent needs to dynamically interact with APIs like booking flights or fetching weather data. They reduce ambiguity in API calls by guiding the agent with context and examples.</p>
</li>
<li><p><strong>Example</strong>: A custom tool implementation fits here because it defines tools with a name and description, guiding the LLM to invoke the correct tool and arguments based on context.</p>
</li>
</ul>
</li>
<li><p><strong>Functions</strong></p>
<ul>
<li><p><strong>Definition</strong>: Functions are reusable logic modules that allow developers to define behavior and handle specific tasks.</p>
</li>
<li><p><strong>Difference</strong>: Unlike extensions, functions offload API execution to client-side logic or middleware, especially in cases where security or authentication constraints prevent direct calls from the LLM.</p>
</li>
<li><p><strong>Observation</strong>: Google's distinction clarifies that functions give developers fine-grained control and decouple execution from the agent, making iteration easier without redeploying infrastructure.</p>
</li>
</ul>
</li>
</ol>
]]></content:encoded></item><item><title><![CDATA[Core Skills Every Full-Stack Engineer Needs to Stay Relevant in the Age of AI]]></title><description><![CDATA[Today I want to share something I deeply believe will shape the future of software engineering.
As we approach 2025, there are rapid advancements in technology that we, as engineers, cannot afford to ignore. Whether you’re looking to add value to you...]]></description><link>https://zahere.com/core-skills-every-full-stack-engineer-needs-to-stay-relevant-in-the-age-of-ai</link><guid isPermaLink="true">https://zahere.com/core-skills-every-full-stack-engineer-needs-to-stay-relevant-in-the-age-of-ai</guid><category><![CDATA[2025]]></category><category><![CDATA[Roadmap]]></category><category><![CDATA[Full Stack Development]]></category><category><![CDATA[generative ai]]></category><category><![CDATA[AI Engineer]]></category><dc:creator><![CDATA[Zahiruddin Tavargere]]></dc:creator><pubDate>Tue, 31 Dec 2024 05:01:57 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1735621171782/f6b414a6-8048-41b9-bb85-f24d1e9857d8.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Today I want to share something I deeply believe will shape the future of software engineering.</p>
<p>As we approach 2025, there are rapid advancements in technology that we, as engineers, cannot afford to ignore. Whether you’re looking to add value to your organization, grow your skillset, or simply future-proof your career, there are certain core skills that will set you apart.</p>
<p>Let’s dive right in.</p>
<h2 id="heading-the-new-baseline-ai-and-business-context-understanding">The New Baseline: AI and Business Context Understanding</h2>
<p>Gone are the days when being a full-stack engineer meant just knowing how to develop and deploy applications. Today, understanding the "why" behind what you’re building is just as important as the "how."</p>
<p>If you find yourself working on user stories without knowing the business context or understanding the decisions made by your engineering or product managers, it’s time to rethink your approach.</p>
<p>Business understanding is what differentiates an average engineer from a great one.</p>
<p>And as AI continues to integrate into every facet of technology, this understanding will become even more critical.</p>
<p>Here’s my take on the essential skills every full-stack engineer should master to thrive in this new era.</p>
<hr />
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1735620941850/c6ff589f-e764-4ccb-8c3b-00d64c71bb04.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-1-core-aiml-engineering-skills">1. <strong>Core AI/ML Engineering Skills</strong></h2>
<h3 id="heading-large-language-models-llms"><strong>Large Language Models (LLMs)</strong></h3>
<p>By now, you’ve probably experimented with tools like ChatGPT or integrated APIs into your projects. But to truly excel, you need to:</p>
<ul>
<li><p>Understand how LLMs work at a deeper level.</p>
</li>
<li><p>Learn about fine-tuning models for specific use cases.</p>
</li>
<li><p>Master <strong>prompt engineering</strong> – arranging instructions in ways that yield the best results.</p>
</li>
</ul>
<h4 id="heading-retrieval-augmented-generation-rag"><strong>Retrieval-Augmented Generation (RAG)</strong></h4>
<p>RAG will remain indispensable for enterprise applications, even as LLMs handle increasingly large context windows. Why? Because RAG ensures that only the most relevant data is fed into the model, optimizing performance and cost. Gaining expertise in designing RAG pipelines and integrating them with business workflows is a must.</p>
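<p>To make the idea concrete, here is a toy, dependency-free sketch of the retrieval step. The documents and their 3-dimensional "embeddings" below are made up for illustration; a real pipeline would use an embedding model and a vector store:</p>

```python
import math

# Toy corpus with hand-written "embeddings" (illustrative 3-dim vectors;
# a real pipeline would compute these with an embedding model).
corpus = [
    ("Refund policy: refunds within 30 days.", [0.9, 0.1, 0.0]),
    ("Shipping takes 3-5 business days.", [0.1, 0.9, 0.0]),
    ("Our office is closed on weekends.", [0.0, 0.2, 0.9]),
]

def cosine(a, b):
    # Cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, k=1):
    # Rank documents by similarity and keep only the top-k for the prompt.
    ranked = sorted(corpus, key=lambda doc: cosine(query_vec, doc[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# A refund-related query (the vector stands in for an embedded question).
context = retrieve([0.8, 0.2, 0.1], k=1)
prompt = f"Answer using only this context:\n{context[0]}\nQuestion: Can I get a refund?"
print(prompt)
```

<p>Only the best-matching snippet ends up in the prompt, which is exactly the relevance and cost win RAG offers over dumping everything into the context window.</p>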
<h4 id="heading-ai-agents"><strong>AI Agents</strong></h4>
<p>The buzzword for 2025 is "agents." Multi-agent frameworks and systems are set to revolutionize how we build solutions. Understanding how to design and orchestrate these systems will keep you ahead of the curve.</p>
<hr />
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1735620980708/ec0b01d0-07c9-4279-a544-3319546b5189.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-2-technical-stack-mastery">2. <strong>Technical Stack Mastery</strong></h3>
<h4 id="heading-python-the-go-to-language"><strong>Python: The Go-To Language</strong></h4>
<p>Python’s ecosystem for AI and ML is unparalleled. From frameworks like PyTorch and scikit-learn to tools like LangChain and LlamaIndex, Python should be in every engineer’s toolkit. If you’re new to the space, start experimenting with these frameworks.</p>
<h4 id="heading-cloud-platforms-and-vector-databases"><strong>Cloud Platforms and Vector Databases</strong></h4>
<ul>
<li><p>Familiarize yourself with cloud platforms like AWS, Azure, or GCP, and tools like VMware Tanzu.</p>
</li>
<li><p>Learn about vector databases, such as PGVector, which allow efficient storage and retrieval of embeddings. Even spinning up a simple Docker instance can give you hands-on experience.</p>
</li>
</ul>
<h4 id="heading-api-development"><strong>API Development</strong></h4>
<p>Frameworks like FastAPI and Flask make it easier to create and deploy AI-powered web applications. Combine this with an understanding of socket programming for real-time communication tools like chatbots, and you’ll be unstoppable.</p>
<hr />
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1735621019139/eeb0b6f7-6ae4-4e1e-bdac-f569cb7a1537.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-3-project-and-professional-skills">3. <strong>Project and Professional Skills</strong></h3>
<h4 id="heading-business-context-awareness"><strong>Business Context Awareness</strong></h4>
<p>This is the game-changer. Knowing how your application aligns with business goals will not only make you a better engineer but also ensure your contributions are recognized. Always ask:</p>
<ul>
<li><p>Why is this feature important?</p>
</li>
<li><p>How will this decision impact the user or business?</p>
</li>
</ul>
<h4 id="heading-experimentation-over-perfection"><strong>Experimentation Over Perfection</strong></h4>
<p>Start small. Use APIs to create prototypes and simulate workflows. For instance:</p>
<ul>
<li><p>Identify areas in your organization where generative AI can add value.</p>
</li>
<li><p>Build mock solutions with sample data to prove concepts.</p>
</li>
</ul>
<h4 id="heading-mlops-and-deployment"><strong>MLOps and Deployment</strong></h4>
<p>Understanding MLOps tools and practices is becoming essential for deploying AI solutions. Even if you’re not directly managing infrastructure, knowing how to streamline deployment pipelines will make you invaluable to your team.</p>
<hr />
<h3 id="heading-my-journey-from-experimentation-to-mastery">My Journey: From Experimentation to Mastery</h3>
<p>When I transitioned into AI engineering in early 2023, I didn’t start by mastering machine learning fundamentals. Instead, I began tinkering with LLM APIs and building prototypes. This hands-on experimentation allowed me to solve real-world problems while gradually deepening my knowledge of ML fundamentals over the next six months.</p>
<p>This approach worked wonders for me, and I recommend it to anyone looking to enter the field. Don’t get bogged down by theory. Instead, balance learning with application. Keep a 60/40 or even 50/50 ratio between theory and practice to maximize your growth.</p>
<hr />
<h3 id="heading-final-thoughts">Final Thoughts</h3>
<p>The expectations for full-stack engineers are evolving, and AI skills are no longer optional.</p>
<p>By mastering core AI/ML engineering concepts, staying updated with the latest frameworks, and understanding business context, you’ll position yourself as a leader in your field.</p>
<p>So, what’s your game plan for 2025? Let me know in the comments below. And if you’re ready to start tinkering, I’ve got tons of tutorials on my channel to help you get started. Until next time, happy coding!</p>
]]></content:encoded></item><item><title><![CDATA[Unlocking the Power of Dynamic Prompting with Jinja2]]></title><description><![CDATA[Colab Notebook: https://colab.research.google.com/drive/18nzaXc7__KDYaPSRyf2mZyCK_pj4ON26
https://www.youtube.com/watch?v=Rq2zM7_5yw0
 
Dynamic prompt generation has become a cornerstone of modern AI workflows.
Whether you're building personalized em...]]></description><link>https://zahere.com/unlocking-the-power-of-dynamic-prompting-with-jinja2</link><guid isPermaLink="true">https://zahere.com/unlocking-the-power-of-dynamic-prompting-with-jinja2</guid><category><![CDATA[#PromptEngineering]]></category><category><![CDATA[Jinja2]]></category><category><![CDATA[generative ai]]></category><category><![CDATA[agents]]></category><dc:creator><![CDATA[Zahiruddin Tavargere]]></dc:creator><pubDate>Sun, 22 Dec 2024 13:42:53 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1734874685443/845e5f96-733b-498b-8521-62ee72e13ca9.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>Colab Notebook:</strong> <a target="_blank" href="https://colab.research.google.com/drive/18nzaXc7__KDYaPSRyf2mZyCK_pj4ON26"><strong>https://colab.research.google.com/drive/18nzaXc7__KDYaPSRyf2mZyCK_pj4ON26</strong></a></p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=Rq2zM7_5yw0">https://www.youtube.com/watch?v=Rq2zM7_5yw0</a></div>
<p> </p>
<p>Dynamic prompt generation has become a cornerstone of modern AI workflows.</p>
<p>Whether you're building personalized email campaigns, travel itineraries, or AI-driven recommendations, the ability to generate structured content dynamically is invaluable.</p>
<p>In this blog, we'll explore how <strong>Jinja2</strong>, a powerful templating engine, stands out in this domain and compare it with tools like LangChain for crafting dynamic prompts.</p>
<hr />
<h2 id="heading-why-dynamic-prompting-matters">Why Dynamic Prompting Matters</h2>
<p>If you manage an AI assistant tasked with creating personalized travel itineraries or summarizing user activities, static prompts won't cut it here – you need templates that adapt to the data at hand. This is where tools like <strong>Jinja2</strong> and LangChain's prompt templates shine.</p>
<hr />
<h2 id="heading-jinja2-the-all-rounder-for-dynamic-templates">Jinja2: The All-Rounder for Dynamic Templates</h2>
<p>Jinja2 is a versatile templating engine widely known for its use in web development but equally adept at generating dynamic text for emails, reports, and prompts. Here's why Jinja2 should be in your toolkit:</p>
<h3 id="heading-1-seamless-integration-of-logic">1. <strong>Seamless Integration of Logic</strong></h3>
<p>Jinja2 allows you to embed loops, conditionals, and filters directly in your templates. For example, creating tailored recommendations becomes straightforward:</p>
<pre><code class="lang-python">Dear {{ user_name }},

Here’s a summary of your recent activities:
{% for activity in activities %}
- On {{ activity.date }}: {{ activity.description }}
{% endfor %}

{% if status == "pass" %}
Congratulations on passing the test. Keep up the great work!
{% else %}
Keep trying, and you'll get there!
{% endif %}
</code></pre>
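<p>Rendering that template takes only a few lines with the <code>jinja2</code> package; the sample data below is illustrative:</p>

```python
from jinja2 import Template

template_text = """Dear {{ user_name }},

Here's a summary of your recent activities:
{% for activity in activities %}- On {{ activity.date }}: {{ activity.description }}
{% endfor %}
{% if status == "pass" %}Congratulations on passing the test. Keep up the great work!
{% else %}Keep trying, and you'll get there!
{% endif %}"""

# Render the template with concrete data.
rendered = Template(template_text).render(
    user_name="Alice",
    activities=[{"date": "2024-12-01", "description": "Completed the Python course."}],
    status="pass",
)
print(rendered)
```

<p>Because Jinja2 falls back from attribute to item lookup, <code>activity.date</code> works for plain dictionaries as well as objects.</p>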
<h3 id="heading-2-readable-and-reusable">2. <strong>Readable and Reusable</strong></h3>
<p>With its clean syntax, Jinja2 makes templates easy to maintain and reuse across projects. It's perfect for use cases like:</p>
<ul>
<li><p>Personalized emails</p>
</li>
<li><p>Travel itineraries</p>
</li>
<li><p>AI-driven content generation</p>
</li>
</ul>
<h3 id="heading-3-performance-efficiency">3. <strong>Performance Efficiency</strong></h3>
<p>Jinja2 minimizes overhead, making it an excellent choice for applications requiring rapid dynamic rendering.</p>
<hr />
<h2 id="heading-comparing-jinja2-and-langchain-for-prompt-templates">Comparing Jinja2 and LangChain for Prompt Templates</h2>
<p>While Jinja2 excels in general-purpose dynamic content generation, LangChain's <code>PromptTemplate</code> is specifically designed for AI workflows, making it the go-to for LLM integrations.</p>
<h3 id="heading-langchain-example">LangChain Example:</h3>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain.prompts <span class="hljs-keyword">import</span> PromptTemplate

template = <span class="hljs-string">"""
Dear {user_name},

Here’s a summary of your recent activities:
{activities}

Here are some tailored recommendations for you:
{recommendations}

{closing_note}
"""</span>

prompt = PromptTemplate(
    input_variables=[<span class="hljs-string">"user_name"</span>, <span class="hljs-string">"activities"</span>, <span class="hljs-string">"recommendations"</span>, <span class="hljs-string">"closing_note"</span>],
    template=template,
)

email = prompt.format(
    user_name=<span class="hljs-string">"Alice"</span>,
    activities=<span class="hljs-string">"- Completed the Python course.\n- Joined the AI workshop."</span>,
    recommendations=<span class="hljs-string">"- Read 'Deep Learning for Beginners'.\n- Join the Advanced AI Projects Club."</span>,
    closing_note=<span class="hljs-string">"Congratulations on passing the test!"</span>,
)

print(email)
</code></pre>
<h3 id="heading-key-differences">Key Differences:</h3>
<ul>
<li><p><strong>Flexibility</strong>: Jinja2 supports complex logic directly in the template, while LangChain separates logic and content.</p>
</li>
<li><p><strong>AI Integration</strong>: LangChain is optimized for workflows where prompts are fed into LLMs.</p>
</li>
<li><p><strong>Learning Curve</strong>: Jinja2 has a gentler curve for general developers, whereas LangChain is ideal for those already in the AI ecosystem.</p>
</li>
</ul>
<hr />
<h2 id="heading-real-world-use-case-personalized-travel-itineraries">Real-World Use Case: Personalized Travel Itineraries</h2>
<p>Using Jinja2, you can craft luxurious travel experiences tailored to user preferences. Here's an example (the input data shown is abridged; the rendered output below assumes a fuller traveler record):</p>
<h3 id="heading-input-data">Input Data</h3>
<pre><code class="lang-python">input_data = {
    <span class="hljs-string">"name"</span>: <span class="hljs-string">"John Doe"</span>,
    <span class="hljs-string">"destination"</span>: <span class="hljs-string">"Paris, France"</span>,
    <span class="hljs-string">"interests"</span>: [<span class="hljs-string">"art"</span>, <span class="hljs-string">"history"</span>, <span class="hljs-string">"fine dining"</span>],
    <span class="hljs-string">"travel_type"</span>: <span class="hljs-string">"luxury"</span>,
    <span class="hljs-string">"suggested_activities"</span>: [
        {<span class="hljs-string">"name"</span>: <span class="hljs-string">"Private Louvre Tour"</span>, <span class="hljs-string">"description"</span>: <span class="hljs-string">"Explore iconic art pieces with a guide."</span>},
        {<span class="hljs-string">"name"</span>: <span class="hljs-string">"Seine River Dinner Cruise"</span>, <span class="hljs-string">"description"</span>: <span class="hljs-string">"Enjoy a gourmet dinner on a Seine cruise."</span>},
    ],
    <span class="hljs-string">"recommended_accommodations"</span>: [
        {<span class="hljs-string">"name"</span>: <span class="hljs-string">"Le Meurice"</span>, <span class="hljs-string">"description"</span>: <span class="hljs-string">"5-star luxury hotel with Michelin dining."</span>},
    ],
}
</code></pre>
<h3 id="heading-jinja2-template">Jinja2 Template</h3>
<pre><code class="lang-python"> <span class="hljs-string">"""
Traveler Profile:
- Name: {{ name }}
- Age: {{ age }}
- Travel Dates: {{ travel_dates }}
- Travel Destination: {{ destination }}
- Interests: {{ interests|join(", ") }}

{% if travel_type == 'luxury' %}
The traveler prefers a luxury experience. Suggest the following premium activities and accommodations:
{% elif travel_type == 'adventure' %}
The traveler seeks adventure. Recommend these thrilling activities and adventurous destinations:
{% else %}
The traveler is interested in a balanced experience. Consider these activities and attractions:
{% endif %}

{% for activity in suggested_activities %}
- {{ activity.name }}: {{ activity.description }}
  {% if activity.requirements %}
  Requirements: {{ activity.requirements|join(", ") }}
  {% endif %}
{% endfor %}

{% if recommended_accommodations %}
Recommended Accommodations:
{% for accommodation in recommended_accommodations %}
- {{ accommodation.name }}: {{ accommodation.description }}
  Location: {{ accommodation.location }}
  Amenities: {{ accommodation.amenities|join(", ") }}
{% endfor %}
{% endif %}

Traveler's Notes:
{% if traveler_notes %}
{% for note in traveler_notes %}
- {{ note }}
{% endfor %}
{% endif %}

Based on the above information, create a 3-day itinerary tailored to the traveler’s preferences and needs. Ensure activities, meals, and downtime are appropriately balanced.
"""</span>
</code></pre>
<h3 id="heading-output">Output</h3>
<pre><code class="lang-markdown">Traveler Profile:
<span class="hljs-bullet">-</span> Name: John Doe
<span class="hljs-bullet">-</span> Age: 35
<span class="hljs-bullet">-</span> Travel Dates: 2024-01-15 to 2024-01-18
<span class="hljs-bullet">-</span> Travel Destination: Paris, France
<span class="hljs-bullet">-</span> Interests: art, history, fine dining, luxury shopping


The traveler prefers a luxury experience. Suggest the following premium activities and accommodations:



<span class="hljs-bullet">-</span> Private Louvre Tour: Enjoy a private, guided tour of the Louvre Museum, exploring its iconic art pieces.

  Requirements: Comfortable walking shoes, Museum pass


<span class="hljs-bullet">-</span> Seine River Dinner Cruise: Experience a luxurious evening with a gourmet dinner on a Seine River cruise.

  Requirements: Formal attire


<span class="hljs-bullet">-</span> Champs-Élysées Shopping Tour: Indulge in a day of shopping at high-end boutiques along the Champs-Élysées.




Recommended Accommodations:

<span class="hljs-bullet">-</span> Le Meurice: A 5-star luxury hotel offering Michelin-star dining and exceptional service.
  Location: Rue de Rivoli, Paris
  Amenities: Spa, Fine dining, Concierge service

<span class="hljs-bullet">-</span> Hôtel Plaza Athénée: Iconic Parisian hotel with stunning views of the Eiffel Tower.
  Location: Avenue Montaigne, Paris
  Amenities: Luxury suites, Haute couture stores, Gourmet restaurants



Traveler's Notes:


<span class="hljs-bullet">-</span> Has dietary restrictions: no shellfish.

<span class="hljs-bullet">-</span> Prefers private tours over group activities.



Based on the above information, create a 3-day itinerary tailored to the traveler’s preferences and needs. Ensure activities, meals, and downtime are appropriately balanced.
</code></pre>
<hr />
<h2 id="heading-conclusion">Conclusion</h2>
<p>If you're building general-purpose templates or working with structured data, Jinja2 is a clear winner. For AI-centric workflows, LangChain simplifies the integration with LLMs.</p>
<p>Ready to take your prompts to the next level? Dive into Jinja2 or LangChain and unlock the full potential of dynamic content creation!</p>
]]></content:encoded></item><item><title><![CDATA[How to Build a Price Monitoring Agent with Pydantic AI]]></title><description><![CDATA[Keeping track of fluctuating product prices across e-commerce platforms can be a daunting task.
Whether you're tracking a personal wishlist or monitoring competitors' pricing for your business, automating this process can save time and effort.
In thi...]]></description><link>https://zahere.com/how-to-build-a-price-monitoring-agent-with-pydantic-ai</link><guid isPermaLink="true">https://zahere.com/how-to-build-a-price-monitoring-agent-with-pydantic-ai</guid><category><![CDATA[agentic AI]]></category><category><![CDATA[agents]]></category><category><![CDATA[ai agents]]></category><category><![CDATA[Multi-Agent Systems (MAS)]]></category><category><![CDATA[llm]]></category><category><![CDATA[price-tracking]]></category><dc:creator><![CDATA[Zahiruddin Tavargere]]></dc:creator><pubDate>Mon, 16 Dec 2024 06:14:10 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1734329453319/6840f9e6-c72a-4b00-a613-26c965f01633.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Keeping track of fluctuating product prices across e-commerce platforms can be a daunting task.</p>
<p>Whether you're tracking a personal wishlist or monitoring competitors' pricing for your business, automating this process can save time and effort.</p>
<p>In this guide, we’ll explore how to build a <strong>price monitoring agent</strong> using the <strong>Pydantic AI framework</strong>—a robust agentic framework from the creators of Pydantic, the popular data validation library.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1734328934789/a46830a9-7c76-4205-a503-50bba31bcdbf.png" alt class="image--center mx-auto" /></p>
<p>This tutorial is part one of a series. Today, we’ll focus on building a scraper agent to extract key product details like title, description, price, and more.</p>
<p>In the next part, we’ll expand this agent to store data in a database and send notifications for price changes.</p>
<hr />
<h2 id="heading-what-is-pydantic-ai">What is Pydantic AI?</h2>
<p><a target="_blank" href="https://ai.pydantic.dev/">Pydantic AI</a> is revolutionizing the way developers build applications that leverage Generative AI. As a Python Agent Framework, it simplifies the creation of production-grade applications by integrating robust data validation with the power of LLMs. Here’s why Pydantic AI stands out:</p>
<ul>
<li><p><strong>Built on Proven Foundations</strong>: Developed by the creators of Pydantic, which is widely used in various AI frameworks like OpenAI and LangChain, Pydantic AI inherits a strong legacy of type safety and structured data management.</p>
</li>
<li><p><strong>Model-Agnostic Flexibility</strong>: Currently supporting models like OpenAI, Gemini, and Groq, Pydantic AI allows developers to easily implement support for additional models through a simple interface. This flexibility ensures that your application can adapt to various AI technologies without significant overhead.</p>
</li>
<li><p><strong>Enhanced Developer Experience</strong>: With features like vanilla Python control flow and a novel dependency injection system, Pydantic AI empowers developers to apply familiar coding practices. This leads to more maintainable code and a smoother development process.</p>
</li>
<li><p><strong>Streamlined Response Validation</strong>: The framework not only validates incoming data but also ensures that responses from LLMs are structured and validated, enhancing reliability in application behavior.</p>
</li>
</ul>
<hr />
<h2 id="heading-overview-of-the-price-monitoring-agent">Overview of the Price Monitoring Agent</h2>
<p>Our agent will:</p>
<ol>
<li><p>Scrape product details (title, description, price, currency, and image URL) from a given URL.</p>
</li>
<li><p>Parse the information into a structured format.</p>
</li>
<li><p>Prepare for database storage and notification handling (to be implemented in part two).</p>
</li>
</ol>
<p>Here’s how the process works (diagram above)</p>
<ol>
<li><p><strong>Input</strong>: Product page URL</p>
</li>
<li><p><strong>Scraper Tool</strong>: Extracts structured data using Beautiful Soup and Markdownify.</p>
</li>
<li><p><strong>Agent</strong>: Processes the scraped data using Pydantic AI for type-safe responses.</p>
</li>
</ol>
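<p>The "type-safe responses" in step 3 come down to a Pydantic model. A sketch of what the product schema might look like, with field names assumed from the list above:</p>

```python
from pydantic import BaseModel

class ProductDetails(BaseModel):
    """Schema the scraper agent's output is validated against (illustrative)."""
    title: str
    description: str
    price: float
    currency: str
    image_url: str

# Validation coerces and checks types, so a scraped "999.99" string
# becomes a float; malformed data raises a ValidationError instead.
product = ProductDetails(
    title="Noise-Cancelling Headphones",
    description="Over-ear wireless headphones.",
    price="999.99",
    currency="USD",
    image_url="https://example.com/headphones.jpg",
)
print(product.price)
```

<p>If the agent ever returns a malformed payload (say, a non-numeric price), validation fails immediately instead of letting bad data flow downstream.</p>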
<h2 id="heading-watch-the-video-for-full-tutorial">Watch the video for full tutorial</h2>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=hlropi13fO8">https://www.youtube.com/watch?v=hlropi13fO8</a></div>
]]></content:encoded></item><item><title><![CDATA[Building a Multi-Agent Orchestrator: A Step-by-Step Guide]]></title><description><![CDATA[Today, we’re diving into an exciting project: creating a Multi-Agent Orchestrator.
This post is an extension of my earlier guide, "Building an AI Agent from Scratch."

If you’re new here, I recommend revisiting that post to get up to speed, as we’ll ...]]></description><link>https://zahere.com/building-a-multi-agent-orchestrator-a-step-by-step-guide</link><guid isPermaLink="true">https://zahere.com/building-a-multi-agent-orchestrator-a-step-by-step-guide</guid><category><![CDATA[generative ai]]></category><category><![CDATA[#agent]]></category><category><![CDATA[agentic AI]]></category><category><![CDATA[Multi-Agent Systems (MAS)]]></category><dc:creator><![CDATA[Zahiruddin Tavargere]]></dc:creator><pubDate>Fri, 06 Dec 2024 13:57:49 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1733493348874/c3e17609-baa3-4aa1-b450-e941768af604.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Today, we’re diving into an exciting project: <strong>creating a Multi-Agent Orchestrator</strong>.</p>
<p><a target="_blank" href="https://zahere.com/how-to-build-an-ai-agent-without-using-any-libraries-a-step-by-step-guide">This post is an extension of my earlier guide, <em>"Building an AI Agent from Scratch."</em></a></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1733458430168/17896b93-8b33-46ae-8683-e578e493ec3b.png" alt class="image--center mx-auto" /></p>
<p>If you’re new here, I recommend revisiting that post to get up to speed, as we’ll build upon its concepts and code.</p>
<p>In this project, we’ll tackle orchestrating actions between multiple agents, enabling seamless execution of tasks such as fetching weather information and the current time. Let’s jump in!</p>
<hr />
<h3 id="heading-what-is-a-multi-agent-orchestrator"><strong>What Is a Multi-Agent Orchestrator?</strong></h3>
<p>A Multi-Agent Orchestrator is a system that:</p>
<ol>
<li><p><strong>Identifies the intent</strong> of a user’s input.</p>
</li>
<li><p><strong>Selects the appropriate agent</strong> to handle the request.</p>
</li>
<li><p><strong>Executes tasks</strong> using tools associated with the agent.</p>
</li>
</ol>
<p>Think of it as a manager assigning tasks to specialized team members. This orchestration ensures complex queries involving multiple tasks are handled efficiently.</p>
<hr />
<h3 id="heading-what-well-build"><strong>What We’ll Build</strong></h3>
<p>We’ll create:</p>
<ul>
<li><p><strong>Agents</strong>: Specialized entities to handle tasks like fetching weather or time.</p>
</li>
<li><p><strong>Tools</strong>: Functional utilities for the agents, such as APIs or database queries.</p>
</li>
<li><p><strong>Orchestrator</strong>: The central system managing task delegation and execution.</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1733458475635/979ad64d-2de4-4569-8c06-c8c081e78c71.png" alt class="image--center mx-auto" /></p>
<hr />
<h3 id="heading-key-components-of-an-agent"><strong>Key Components of an Agent</strong></h3>
<p>An agent has three main components:</p>
<ol>
<li><p><strong>Reasoning Loop</strong>: Decides the next action based on context.</p>
</li>
<li><p><strong>Model</strong>: Uses a language model (LLM) for decision-making.</p>
</li>
<li><p><strong>Tools</strong>: A list of utilities to perform specific tasks.</p>
</li>
</ol>
<p>Our agents will dynamically decide which tool to use, making them highly adaptable.</p>
<hr />
<h3 id="heading-the-agent-class"><strong>The Agent Class</strong></h3>
<p>Here’s a high-level breakdown of the <code>Agent</code> class:</p>
<ul>
<li><p><strong>Constructor</strong>: Initializes the agent with a name, description, tools, and an LLM model.</p>
</li>
<li><p><strong>Process Input</strong>: Takes user input, decides on a tool, and executes the task.</p>
</li>
<li><p><strong>Prompting</strong>: Constructs a prompt for the LLM to guide decision-making.</p>
</li>
</ul>
<p>Agents also handle parsing JSON responses from the LLM to ensure smooth execution.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> abc <span class="hljs-keyword">import</span> ABC, abstractmethod
<span class="hljs-keyword">import</span> ast
<span class="hljs-keyword">import</span> os
<span class="hljs-keyword">import</span> requests
<span class="hljs-keyword">from</span> llm.llm_ops <span class="hljs-keyword">import</span> query_llm
<span class="hljs-keyword">from</span> tools.base_tool <span class="hljs-keyword">import</span> Tool
<span class="hljs-keyword">import</span> json


<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Agent</span>:</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self, Name: str, Description: str, Tools: list, Model: str</span>):</span>        
        self.memory = []
        self.name = Name
        self.description = Description
        self.tools = Tools
        self.model = Model
        self.max_memory = <span class="hljs-number">10</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">json_parser</span>(<span class="hljs-params">self, input_string</span>):</span>

      python_dict = ast.literal_eval(input_string)
      json_string = json.dumps(python_dict)
      json_dict = json.loads(json_string)

      <span class="hljs-keyword">if</span> isinstance(json_dict, dict) <span class="hljs-keyword">or</span> isinstance(json_dict,list):
        <span class="hljs-keyword">return</span> json_dict

      <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">"Invalid JSON response"</span>)

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">process_input</span>(<span class="hljs-params">self, user_input</span>):</span>
        self.memory.append(<span class="hljs-string">f"User: <span class="hljs-subst">{user_input}</span>"</span>)

        context = <span class="hljs-string">"\n"</span>.join(self.memory)
        tool_descriptions = <span class="hljs-string">"\n"</span>.join([<span class="hljs-string">f"- <span class="hljs-subst">{tool.name()}</span>: <span class="hljs-subst">{tool.description()}</span>"</span> <span class="hljs-keyword">for</span> tool <span class="hljs-keyword">in</span> self.tools])
        response_format = {<span class="hljs-string">"action"</span>:<span class="hljs-string">""</span>, <span class="hljs-string">"args"</span>:<span class="hljs-string">""</span>}

        prompt = <span class="hljs-string">f"""Context:
        <span class="hljs-subst">{context}</span>

        Available tools:
        <span class="hljs-subst">{tool_descriptions}</span>

        Based on the user's input and context, decide if you should use a tool or respond directly.        
        If you identify an action, respond with the tool name and the arguments for the tool.
        If you decide to respond directly to the user, make the action "respond_to_user" with args as your response, in the following format.

        Response Format:
        <span class="hljs-subst">{response_format}</span>

        """</span>

        response = query_llm(prompt)
        self.memory.append(<span class="hljs-string">f"Agent: <span class="hljs-subst">{response}</span>"</span>)

        response_dict = self.json_parser(response)

        <span class="hljs-comment"># Check if any tool can handle the input</span>
        <span class="hljs-keyword">for</span> tool <span class="hljs-keyword">in</span> self.tools:
            <span class="hljs-keyword">if</span> tool.name().lower() == response_dict[<span class="hljs-string">"action"</span>].lower():
                <span class="hljs-keyword">return</span> tool.use(response_dict[<span class="hljs-string">"args"</span>])

        <span class="hljs-keyword">return</span> response_dict
</code></pre>
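<p>The <code>Tool</code> base class imported above isn’t shown in this post. A minimal sketch consistent with how the agent calls it (<code>name()</code>, <code>description()</code>, <code>use(args)</code>) might look like this; the <code>TimeTool</code> is a hypothetical example:</p>

```python
from abc import ABC, abstractmethod
from datetime import datetime

class Tool(ABC):
    """Interface every tool exposes to an agent (assumed shape)."""

    @abstractmethod
    def name(self) -> str:
        """Identifier the LLM uses to select this tool."""

    @abstractmethod
    def description(self) -> str:
        """One-line description injected into the agent's prompt."""

    @abstractmethod
    def use(self, args) -> str:
        """Execute the tool with the arguments chosen by the LLM."""

class TimeTool(Tool):
    def name(self) -> str:
        return "get_current_time"

    def description(self) -> str:
        return "Returns the current time in HH:MM format; args are ignored."

    def use(self, args) -> str:
        return datetime.now().strftime("%H:%M")
```

<p>Any class implementing this interface can be dropped into an agent's tool list without changing the agent itself.</p>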
<hr />
<h3 id="heading-the-orchestrator"><strong>The Orchestrator</strong></h3>
<p>The orchestrator coordinates multiple agents:</p>
<ol>
<li><p>Accepts user input.</p>
</li>
<li><p>Selects the right agent based on the intent.</p>
</li>
<li><p>Manages task execution, including cases where multiple tasks are requested.</p>
</li>
</ol>
<p><strong>Core Features of the Orchestrator</strong>:</p>
<ul>
<li><p>Maintains context by storing user queries, agent responses, and intermediate results.</p>
</li>
<li><p>Uses a reasoning loop to determine the next steps.</p>
</li>
<li><p>Constructs prompts to guide the LLM in selecting the right agent and tools.</p>
</li>
</ul>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> ast
<span class="hljs-keyword">import</span> json
<span class="hljs-keyword">from</span> llm.llm_ops <span class="hljs-keyword">import</span> query_llm
<span class="hljs-keyword">from</span> agents.base_agent <span class="hljs-keyword">import</span> Agent
<span class="hljs-keyword">from</span> logger <span class="hljs-keyword">import</span> log_message

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">AgentOrchestrator</span>:</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self, agents: list[Agent]</span>):</span>
        self.agents = agents
        self.memory = []  <span class="hljs-comment"># Stores the reasoning and action steps taken</span>
        self.max_memory = <span class="hljs-number">10</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">json_parser</span>(<span class="hljs-params">self, input_string</span>):</span>

      print(type(input_string))

      python_dict = ast.literal_eval(input_string)
      json_string = json.dumps(python_dict)
      json_dict = json.loads(json_string)

      <span class="hljs-keyword">if</span> isinstance(json_dict, dict) <span class="hljs-keyword">or</span> isinstance(json_dict,list):
        <span class="hljs-keyword">return</span> json_dict

      <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">"Invalid JSON response"</span>)

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">orchestrate_task</span>(<span class="hljs-params">self, user_input: str</span>):</span>        
        self.memory = self.memory[-self.max_memory:]

        context = <span class="hljs-string">"\n"</span>.join(self.memory)

        print(<span class="hljs-string">f"Context: <span class="hljs-subst">{context}</span>"</span>)

        response_format = {<span class="hljs-string">"action"</span>:<span class="hljs-string">""</span>, <span class="hljs-string">"input"</span>:<span class="hljs-string">""</span>, <span class="hljs-string">"next_action"</span>:<span class="hljs-string">""</span>}

        <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_prompt</span>(<span class="hljs-params">user_input</span>):</span>
            <span class="hljs-keyword">return</span> <span class="hljs-string">f"""

                Use the context from memory to plan next steps.                
                Context:
                <span class="hljs-subst">{context}</span>

                You are an expert intent classifier.
                You will use the context provided and the user's input to classify the intent and select the appropriate agent.
                You will rewrite the input for the agent so that the agent can efficiently execute the task.                                                

                Here are the available agents and their descriptions:
                <span class="hljs-subst">{<span class="hljs-string">", "</span>.join([<span class="hljs-string">f"- <span class="hljs-subst">{agent.name}</span>: <span class="hljs-subst">{agent.description}</span>"</span> <span class="hljs-keyword">for</span> agent <span class="hljs-keyword">in</span> self.agents])}</span>

                User Input:
                <span class="hljs-subst">{user_input}</span>              

                ###Guidelines###
                - Sometimes you might have to use multiple agents to solve the user's input. You have to do that in a loop.
                - The original user input could contain multiple tasks; use the context to understand the previous actions taken and the next steps you should take.
                - Read the context carefully to see whether there were multiple tasks and whether you executed them all.
                - If there are no actions to be taken, then make the action "respond_to_user" with your final thoughts combining all previous responses as input.
                - Respond with "respond_to_user" only when there are no agents to select from or there is no next_action
                - You will return the agent name in the form of <span class="hljs-subst">{response_format}</span>
                - Always return valid JSON like <span class="hljs-subst">{response_format}</span> and nothing else.                

                """</span>


        response = <span class="hljs-string">""</span>
        loop_count = <span class="hljs-number">0</span>
        self.memory = self.memory[<span class="hljs-number">-10</span>:]        
        prompt = get_prompt(user_input)
        llm_response = query_llm(prompt)

        llm_response = self.json_parser(llm_response)
        print(<span class="hljs-string">f"LLM Response: <span class="hljs-subst">{llm_response}</span>"</span>)

        self.memory.append(<span class="hljs-string">f"Orchestrator: <span class="hljs-subst">{llm_response}</span>"</span>)


        action = llm_response[<span class="hljs-string">"action"</span>]
        user_input = llm_response[<span class="hljs-string">"input"</span>]

        print(<span class="hljs-string">f"Action identified by LLM: <span class="hljs-subst">{action}</span>"</span>)


        <span class="hljs-keyword">if</span> action == <span class="hljs-string">"respond_to_user"</span>:
            <span class="hljs-keyword">return</span> llm_response
        <span class="hljs-keyword">for</span> agent <span class="hljs-keyword">in</span> self.agents:
            <span class="hljs-keyword">if</span> agent.name == action:
                print(<span class="hljs-string">"*******************Found Agent Name*******************************"</span>)
                agent_response = agent.process_input(user_input)
                print(<span class="hljs-string">f"<span class="hljs-subst">{action}</span> response: <span class="hljs-subst">{agent_response}</span>"</span>)
                self.memory.append(<span class="hljs-string">f"Agent Response for Task: <span class="hljs-subst">{agent_response}</span>"</span>)
                print(self.memory)
                <span class="hljs-keyword">return</span> agent_response                


    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">run</span>(<span class="hljs-params">self</span>):</span>
        print(<span class="hljs-string">"LLM Agent: Hello! How can I assist you today?"</span>)
        user_input = input(<span class="hljs-string">"You: "</span>)
        self.memory.append(<span class="hljs-string">f"User: <span class="hljs-subst">{user_input}</span>"</span>)

        <span class="hljs-keyword">while</span> <span class="hljs-literal">True</span>:            
            <span class="hljs-keyword">if</span> user_input.lower() <span class="hljs-keyword">in</span> [<span class="hljs-string">"exit"</span>, <span class="hljs-string">"bye"</span>, <span class="hljs-string">"close"</span>]:
                print(<span class="hljs-string">"See you later!"</span>)
                <span class="hljs-keyword">break</span>

            response = self.orchestrate_task(user_input)
            print(<span class="hljs-string">f"Final response of orchestrator <span class="hljs-subst">{response}</span>"</span>)
            <span class="hljs-keyword">if</span> isinstance(response, dict) <span class="hljs-keyword">and</span> response[<span class="hljs-string">"action"</span>] == <span class="hljs-string">"respond_to_user"</span>:                
                log_message(<span class="hljs-string">f"Response from Agent: <span class="hljs-subst">{response[<span class="hljs-string">'input'</span>]}</span>"</span>, <span class="hljs-string">"RESPONSE"</span>)
                user_input = input(<span class="hljs-string">"You: "</span>)
                self.memory.append(<span class="hljs-string">f"User: <span class="hljs-subst">{user_input}</span>"</span>)                
            <span class="hljs-keyword">elif</span> response == <span class="hljs-string">"No action or agent needed"</span>:
                print(<span class="hljs-string">"Response from Agent: "</span>, response)
                user_input = input(<span class="hljs-string">"You: "</span>)
            <span class="hljs-keyword">else</span>:
                user_input = response
</code></pre>
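<p>One design note on the json_parser helper above: LLMs sometimes emit Python-style dicts (single-quoted keys) rather than strict JSON, which json.loads alone would reject. A simplified, standalone sketch of the same parsing logic shows why ast.literal_eval is used first:</p>

```python
import ast

def json_parser(input_string: str):
    # ast.literal_eval accepts Python-style literals (e.g. single-quoted
    # keys) that strict json.loads would reject with a JSONDecodeError.
    parsed = ast.literal_eval(input_string)
    if isinstance(parsed, (dict, list)):
        return parsed
    raise ValueError("Invalid JSON response")

# Single-quoted output, as an LLM sometimes produces:
llm_output = "{'action': 'Weather Agent', 'input': 'weather in Bangalore', 'next_action': ''}"
parsed = json_parser(llm_output)
print(parsed["action"])  # → Weather Agent
```

<p>The trade-off: ast.literal_eval does not understand JSON's true, false, or null, so strictly JSON input containing those tokens would need json.loads instead.</p>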
<hr />
<h3 id="heading-tools-in-action"><strong>Tools in Action</strong></h3>
<p>Agents use tools to perform tasks. For example:</p>
<ul>
<li><p><strong>Weather Tool</strong>: Fetches real-time weather data from OpenWeatherMap.</p>
</li>
<li><p><strong>Time Tool</strong>: Determines the local time for a given city, even without a timezone.</p>
</li>
</ul>
<p>Each tool includes:</p>
<ul>
<li><p>A <strong>name</strong> and <strong>description</strong> to guide the LLM.</p>
</li>
<li><p>A <strong>use method</strong> to perform the task.</p>
</li>
</ul>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> os
<span class="hljs-keyword">import</span> requests
<span class="hljs-keyword">from</span> tools.base_tool <span class="hljs-keyword">import</span> Tool

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">WeatherTool</span>(<span class="hljs-params">Tool</span>):</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">name</span>(<span class="hljs-params">self</span>):</span>
        <span class="hljs-keyword">return</span> <span class="hljs-string">"Weather Tool"</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">description</span>(<span class="hljs-params">self</span>):</span>
        <span class="hljs-keyword">return</span> <span class="hljs-string">"Provides weather information for a given location. The payload is just the location. Example: New York"</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">use</span>(<span class="hljs-params">self, location:str</span>):</span>        
        api_key = os.getenv(<span class="hljs-string">"OPENWEATHERMAP_API_KEY"</span>)
        url = <span class="hljs-string">f"http://api.openweathermap.org/data/2.5/weather?q=<span class="hljs-subst">{location}</span>&amp;appid=<span class="hljs-subst">{api_key}</span>&amp;units=metric"</span>
        response = requests.get(url)
        data = response.json()
        <span class="hljs-keyword">if</span> data[<span class="hljs-string">"cod"</span>] == <span class="hljs-number">200</span>:
            temp = data[<span class="hljs-string">"main"</span>][<span class="hljs-string">"temp"</span>]
            description = data[<span class="hljs-string">"weather"</span>][<span class="hljs-number">0</span>][<span class="hljs-string">"description"</span>]
            response = <span class="hljs-string">f"The weather in <span class="hljs-subst">{location}</span> is currently <span class="hljs-subst">{description}</span> with a temperature of <span class="hljs-subst">{temp}</span>°C."</span>
            print(response)
            <span class="hljs-keyword">return</span> response
        <span class="hljs-keyword">else</span>:
            <span class="hljs-keyword">return</span> <span class="hljs-string">f"Sorry, I couldn't find weather information for <span class="hljs-subst">{location}</span>."</span>
</code></pre>
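<p>The Time Tool mentioned above is not shown in this post, so here is a hypothetical, self-contained sketch of what one could look like. It omits the Tool base class and resolves cities through a small hand-made city-to-timezone lookup using Python's zoneinfo; the actual implementation in the repo may resolve arbitrary cities differently (for example via an external API).</p>

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# Hypothetical lookup table standing in for real timezone resolution.
CITY_TIMEZONES = {
    "bangalore": "Asia/Kolkata",
    "new york": "America/New_York",
    "london": "Europe/London",
}

class TimeTool:
    def name(self):
        return "Time Tool"

    def description(self):
        return "Provides the current local time for a given city. The payload is just the city. Example: Bangalore"

    def use(self, city: str):
        # Normalize the city name and look up its IANA timezone.
        tz_name = CITY_TIMEZONES.get(city.strip().lower())
        if tz_name is None:
            return f"Sorry, I couldn't determine the time zone for {city}."
        now = datetime.now(ZoneInfo(tz_name))
        return f"The current time in {city} is {now.strftime('%I:%M %p')}."
```

<p>Like the Weather Tool, the name and description are what the LLM sees when deciding which tool handles the request, so they should state the expected payload clearly.</p>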
<hr />
<h3 id="heading-demo-running-the-orchestrator"><strong>Demo: Running the Orchestrator</strong></h3>
<p>Here’s a quick demonstration:</p>
<ol>
<li><p><strong>Query</strong>: <em>“What’s the weather in Bangalore, and what’s the current time?”</em></p>
</li>
<li><p><strong>Execution</strong>:</p>
<ul>
<li><p>The orchestrator identifies the intent (weather and time).</p>
</li>
<li><p>Delegates tasks to the respective agents.</p>
</li>
<li><p>Combines responses to provide the final answer.</p>
</li>
</ul>
</li>
</ol>
<p><strong>Example Output</strong>:</p>
<ul>
<li><em>"The weather in Bangalore is misty with a temperature of 22°C. The current time in Bangalore is 12:27 AM."</em></li>
</ul>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> agents.base_agent <span class="hljs-keyword">import</span> Agent
<span class="hljs-keyword">from</span> tools.weather_tool <span class="hljs-keyword">import</span> WeatherTool
<span class="hljs-keyword">from</span> tools.time_tool <span class="hljs-keyword">import</span> TimeTool
<span class="hljs-keyword">from</span> orchestrator <span class="hljs-keyword">import</span> AgentOrchestrator

<span class="hljs-keyword">from</span> dotenv <span class="hljs-keyword">import</span> load_dotenv
<span class="hljs-keyword">import</span> os

<span class="hljs-comment"># Load environment variables from .env file</span>
load_dotenv()

<span class="hljs-comment"># Create Weather Agent</span>
weather_agent = Agent(
    Name=<span class="hljs-string">"Weather Agent"</span>,
    Description=<span class="hljs-string">"Provides weather information for a given location"</span>,
    Tools=[WeatherTool()],
    Model=<span class="hljs-string">"gpt-4o-mini"</span>
)

<span class="hljs-comment"># Create Time Agent</span>
time_agent = Agent(
    Name=<span class="hljs-string">"Time Agent"</span>,
    Description=<span class="hljs-string">"Provides the current time for a given city"</span>,
    Tools=[TimeTool()],
    Model=<span class="hljs-string">"gpt-4o-mini"</span>
)

<span class="hljs-comment"># Create AgentOrchestrator</span>
agent_orchestrator = AgentOrchestrator([weather_agent, time_agent])

<span class="hljs-comment"># Run the orchestrator</span>
agent_orchestrator.run()
</code></pre>
<hr />
<h3 id="heading-whats-next"><strong>What’s Next?</strong></h3>
<p>This orchestrator is just the beginning. You can:</p>
<ul>
<li><p>Add more agents for tasks like translation, currency conversion, or database queries.</p>
</li>
<li><p>Optimize prompts for better LLM responses.</p>
</li>
<li><p>Extend the system for real-world applications like customer support or smart assistants.</p>
</li>
</ul>
<hr />
<h3 id="heading-full-code">Full Code</h3>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://github.com/zahere-dev/augmate">https://github.com/zahere-dev/augmate</a></div>
<p> </p>
<h3 id="heading-final-thoughts"><strong>Final Thoughts</strong></h3>
<p>Building a Multi-Agent Orchestrator showcases the power of combining LLMs with task-specific agents. By modularizing tasks and leveraging context effectively, you can create systems that are both scalable and intelligent.</p>
<p>Stay tuned for more updates, and feel free to share your thoughts or ask questions in the comments below. Don’t forget to check out the accompanying video for a detailed walkthrough of the code.</p>
<p>Happy coding! 🚀</p>
]]></content:encoded></item><item><title><![CDATA[Scrape Websites with Natural Language Prompts Using OpenAI Swarm Agents!]]></title><description><![CDATA[Video
https://www.youtube.com/watch?v=4y8j-LD8IK0
 
The Goal

Welcome to today's post, where we'll dive into building a web scraping agent using Firecrawl, an open-source library with a paid service option.
This guide will show you how to use FireCra...]]></description><link>https://zahere.com/scrape-websites-with-natural-language-prompts-using-openai-swarm-agents</link><guid isPermaLink="true">https://zahere.com/scrape-websites-with-natural-language-prompts-using-openai-swarm-agents</guid><category><![CDATA[generative ai]]></category><category><![CDATA[#agent]]></category><category><![CDATA[Swarm]]></category><category><![CDATA[Scraping]]></category><dc:creator><![CDATA[Zahiruddin Tavargere]]></dc:creator><pubDate>Mon, 04 Nov 2024 17:33:40 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1730741242002/00b6e3c9-8039-4c55-97b6-edc2c524fd3b.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-video">Video</h2>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=4y8j-LD8IK0">https://www.youtube.com/watch?v=4y8j-LD8IK0</a></div>
<p> </p>
<h2 id="heading-the-goal">The Goal</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1730741345457/d3acb682-c3c4-4c1e-9e05-e0a8771ebcce.png" alt class="image--center mx-auto" /></p>
<p>Welcome to today's post, where we'll dive into building a web scraping agent using Firecrawl, an open-source library with a paid service option.</p>
<p>This guide will show you how to use FireCrawl to create a scraping agent that takes a URL and custom instructions, then extracts specific content and outputs it in a structured format.</p>
<h3 id="heading-why-firecrawl">Why Firecrawl?</h3>
<p>Traditional web scraping methods rely on manually defining XPath selectors or CSS selectors, which often requires detailed knowledge of a webpage’s structure.</p>
<p><a target="_blank" href="https://github.com/mendableai/firecrawl/tree/main">Firecrawl</a>, however, allows us to leverage natural language instructions to scrape data, making it easier to gather specific elements from a page without extensive coding.</p>
<p>This post will demonstrate Firecrawl’s powerful abstraction over conventional scraping techniques by integrating it with a simple agent interface.</p>
<h3 id="heading-disclaimer">Disclaimer</h3>
<p><em>This example uses the</em> <a target="_blank" href="https://books.toscrape.com/"><em>Books to Scrape website</em></a> <em>—a practice site designed for web scraping.</em></p>
<p><em>Please always ensure that scraping is allowed by a site’s terms of service, especially with commercial websites.</em></p>
<h2 id="heading-code-walkthrough">Code Walkthrough</h2>
<p><a target="_blank" href="https://colab.research.google.com/drive/1Fd39Q0oukKrIJNyzN1ICX9L6QzIEOni-">https://colab.research.google.com/drive/1Fd39Q0oukKrIJNyzN1ICX9L6QzIEOni-</a></p>
<h3 id="heading-high-level-agent-workflow">High-Level Agent Workflow</h3>
<p>Our scraping agent comprises two main components:</p>
<ol>
<li><p><strong>User Interface Agent</strong> – This component processes user queries, confirms instructions, and determines if additional information is needed.</p>
</li>
<li><p><strong>Scraper Agent</strong> – This agent handles the scraping process by fetching content from the given URL, parsing it based on the user’s instructions, and outputting the results in a structured format.</p>
</li>
</ol>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> os
<span class="hljs-keyword">from</span> google.colab <span class="hljs-keyword">import</span> userdata
<span class="hljs-keyword">import</span> json
<span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-keyword">from</span> io <span class="hljs-keyword">import</span> StringIO
<span class="hljs-keyword">from</span> openai <span class="hljs-keyword">import</span> OpenAI
<span class="hljs-keyword">from</span> swarm <span class="hljs-keyword">import</span> Agent
<span class="hljs-keyword">from</span> swarm.repl <span class="hljs-keyword">import</span> run_demo_loop
<span class="hljs-keyword">from</span> firecrawl <span class="hljs-keyword">import</span> FirecrawlApp
<span class="hljs-keyword">import</span> nest_asyncio <span class="hljs-comment"># required for notebooks</span>
nest_asyncio.apply()

os.environ[<span class="hljs-string">'OPENAI_API_KEY'</span>] = userdata.get(<span class="hljs-string">"OPENAI_API_KEY"</span>)
os.environ[<span class="hljs-string">'FIRECRAWL_API_KEY'</span>] = userdata.get(<span class="hljs-string">"FIRECRAWL_API_KEY"</span>)

client = OpenAI(api_key=os.getenv(<span class="hljs-string">"OPENAI_API_KEY"</span>))

<span class="hljs-comment"># Initialize FirecrawlApp and OpenAI</span>
app = FirecrawlApp(api_key=os.getenv(<span class="hljs-string">"FIRECRAWL_API_KEY"</span>))
client = OpenAI(api_key=os.getenv(<span class="hljs-string">"OPENAI_API_KEY"</span>))

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">scrape_website</span>(<span class="hljs-params">url</span>):</span>
    <span class="hljs-string">"""Scrape a website using Firecrawl."""</span>
    scrape_status = app.scrape_url(
        url,
        params={<span class="hljs-string">'formats'</span>: [<span class="hljs-string">'markdown'</span>]}
    )
    <span class="hljs-keyword">return</span> scrape_status

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">generate_completion</span>(<span class="hljs-params">role, task, content</span>):</span>
    <span class="hljs-string">"""Generate a completion using OpenAI."""</span>
    response = client.chat.completions.create(
        model=<span class="hljs-string">"gpt-4o-mini"</span>,
        messages=[
            {<span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">f"You are a <span class="hljs-subst">{role}</span>. <span class="hljs-subst">{task}</span>"</span>},
            {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: content}
        ]
    )
    <span class="hljs-keyword">return</span> response.choices[<span class="hljs-number">0</span>].message.content

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">json_to_csv_downloadable</span>(<span class="hljs-params">json_data, filename=<span class="hljs-string">"output.csv"</span></span>):</span>
    <span class="hljs-string">"""Takes JSON data and converts to csv"""</span>
    <span class="hljs-comment"># Parse the JSON data if it's in string format</span>
    <span class="hljs-keyword">if</span> isinstance(json_data, str):
        json_data = json.loads(json_data)
    <span class="hljs-comment"># Convert JSON data to a DataFrame</span>
    df = pd.DataFrame(json_data)
    <span class="hljs-comment"># Convert the DataFrame to CSV text (the filename parameter is unused here)</span>
    csv_data = df.to_csv(index=<span class="hljs-literal">False</span>)
    display(csv_data)  <span class="hljs-comment"># Shows the CSV data as plain text in the notebook</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">handoff_to_parser</span>():</span>
    <span class="hljs-string">"""Hand off the website content to the parser agent."""</span>
    <span class="hljs-keyword">return</span> parser_agent

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">handoff_to_csv_writer</span>():</span>
    <span class="hljs-string">"""Hand off the parsed content to csv writer"""</span>
    <span class="hljs-keyword">return</span> json_to_csv_downloadable

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">handoff_to_website_scraper</span>():</span>
    <span class="hljs-string">"""Hand off the url to the website scraper agent."""</span>
    <span class="hljs-keyword">return</span> website_scraper_agent

user_interface_agent = Agent(
    name=<span class="hljs-string">"User Interface Agent"</span>,
    model=<span class="hljs-string">"gpt-4o-mini"</span>,
    instructions=<span class="hljs-string">"You are a user interface agent that handles all interactions with the user. You need to always start with a URL that the user wants to extract content from. The user will expect specific content to be extracted. Ask clarification questions if needed. Be concise."</span>,
    functions=[handoff_to_website_scraper],
)

website_scraper_agent = Agent(
    name=<span class="hljs-string">"Website Scraper Agent"</span>,
    instructions=<span class="hljs-string">"You are a website scraper agent specialized in scraping website content. If the scraped content is valid, handoff to csv writer"</span>,
    functions=[scrape_website,json_to_csv_downloadable],
)
</code></pre>
<p>The run_demo_loop method in Swarm allows us to run an agent in a loop. In this case, it is the user interface agent.</p>
<pre><code class="lang-python"><span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    <span class="hljs-comment"># Run the demo loop with the user interface agent</span>
    run_demo_loop(user_interface_agent, stream=<span class="hljs-literal">True</span>)
</code></pre>
<h2 id="heading-talk-to-the-agent">Talk to the Agent</h2>
<p>Let’s run the code and give it the following instruction</p>
<pre><code class="lang-python">get title <span class="hljs-keyword">and</span> price <span class="hljs-keyword">from</span> the top <span class="hljs-number">2</span> products <span class="hljs-keyword">in</span> https://books.toscrape.com/
</code></pre>
<p>The agent returns the following response:</p>
<pre><code class="lang-python">User: get title <span class="hljs-keyword">and</span> price <span class="hljs-keyword">from</span> the top <span class="hljs-number">2</span> products <span class="hljs-keyword">in</span> https://books.toscrape.com/ 
User Interface Agent: handoff_to_website_scraper()
Website Scraper Agent: scrape_website()
Website Scraper Agent: json_to_csv_downloadable()
title,price\nA Light <span class="hljs-keyword">in</span> the Attic,£<span class="hljs-number">51.77</span>\nTipping the Velvet,£<span class="hljs-number">53.74</span>\n
Website Scraper Agent: The title <span class="hljs-keyword">and</span> price of the top <span class="hljs-number">2</span> books <span class="hljs-keyword">from</span> the website have been extracted <span class="hljs-keyword">and</span> formatted into a CSV file named `top_2_books.csv`. You can download it <span class="hljs-keyword">from</span> the file management system.
</code></pre>
<h3 id="heading-advanced-instructions-with-firecrawl">Advanced Instructions with FireCrawl</h3>
<p>Firecrawl shines when handling more complex queries.</p>
<p>Here’s how you can fetch additional details, such as the product description, from individual book pages.</p>
<p>By providing natural language instructions, you can automate navigation through categories and fetch granular data without manually defining selectors.</p>
<pre><code class="lang-python">User: Get the top <span class="hljs-number">2</span> products <span class="hljs-keyword">from</span> the Philosophy category
</code></pre>
<p>Response:</p>
<pre><code class="lang-python">Website Scraper Agent: Here are the product descriptions <span class="hljs-keyword">for</span> the top <span class="hljs-number">2</span> Philosophy books:

<span class="hljs-number">1.</span> **Sophie<span class="hljs-string">'s World**
   - **Description**: A page-turning novel that is also an exploration of the great philosophical concepts of Western thought, Sophie’s World has fired the imagination of readers all over the world, with more than twenty million copies in print. One day fourteen-year-old Sophie Amundsen comes home from school to find in her mailbox two notes, with one question on each: “Who are you?” and “Where does the world come from?” From that irresistible beginning, Sophie becomes obsessed with questions that take her far beyond what she knows of her Norwegian village. Through those letters, she enrolls in a kind of correspondence course, covering Socrates to Sartre, with a mysterious philosopher, while receiving letters addressed to another girl. Who is Hilde? And why does her mail keep turning up? To unravel this riddle, Sophie must use the philosophy she is learning—but the truth turns out to be far more complicated than she could have imagined.
   - **URL**: [Sophie'</span>s World](https://books.toscrape.com/catalogue/sophies-world_966/index.html)

<span class="hljs-number">2.</span> **The Death of Humanity: <span class="hljs-keyword">and</span> the Case <span class="hljs-keyword">for</span> Life**
   - **Description**: Do you believe human life <span class="hljs-keyword">is</span> inherently valuable? Unfortunately, <span class="hljs-keyword">in</span> the secularized age of state-sanctioned euthanasia <span class="hljs-keyword">and</span> abortion-on-demand, many are losing faith <span class="hljs-keyword">in</span> the simple value of human life. To the disillusioned, human beings are a cosmic accident whose intrinsic value <span class="hljs-keyword">is</span> worth no more than other animals. The Death of Humanity explores our culture<span class="hljs-string">'s declining respect for the sanctity of human life, drawing on philosophy and history to reveal the dark road ahead for society if we lose our faith in human life.
   - **URL**: [The Death of Humanity](https://books.toscrape.com/catalogue/the-death-of-humanity-and-the-case-for-life_932/index.html)</span>
</code></pre>
<h3 id="heading-the-benefits-of-llm-based-scraping">The Benefits of LLM-Based Scraping</h3>
<p>Prior to Large Language Models (LLMs), scraping required precise coding to navigate page structures. LLM integration within tools like Firecrawl reduces this complexity, letting us use natural language commands to scrape data.</p>
<p>This feature enables flexible, conversational interactions with a webpage, significantly reducing the manual effort.</p>
<p>Thank you for following along with this tutorial! If you enjoyed this guide, please share it with others interested in web scraping and stay tuned for a comparison between <strong>FireCrawl</strong> and <strong>CrowdAI</strong> in the next post!</p>
]]></content:encoded></item><item><title><![CDATA[Claude's 'Computer Use' Put to the Test: 5 Challenges and Insights]]></title><description><![CDATA[Today, I set out to challenge Claude’s 'Computer Use' with five specific tasks, each designed to evaluate its precision, adaptability, and overall functionality.
What is Claude’s ‘Computer Use’?
Claude’s Sonnet 3.5 can now interact with the OS. It ca...]]></description><link>https://zahere.com/claudes-computer-use-put-to-the-test-5-challenges-and-insights</link><guid isPermaLink="true">https://zahere.com/claudes-computer-use-put-to-the-test-5-challenges-and-insights</guid><category><![CDATA[generative ai]]></category><category><![CDATA[Artificial Intelligence]]></category><category><![CDATA[#Anthropic #AI #ComputerUse #Technology #MachineLearning #Innovation #Automation #AIFuture #DigitalTransformation #PublicBeta]]></category><category><![CDATA[claude.ai]]></category><dc:creator><![CDATA[Zahiruddin Tavargere]]></dc:creator><pubDate>Sun, 27 Oct 2024 09:22:53 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1730020702357/dbcc0049-11ee-4b00-a7a8-2bc5010e9d6d.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Today, I set out to challenge Claude’s 'Computer Use' with five specific tasks, each designed to evaluate its precision, adaptability, and overall functionality.</p>
<h2 id="heading-what-is-claudes-computer-use">What is Claude’s ‘Computer Use’?</h2>
<p>Claude’s Sonnet 3.5 can now interact with the OS. It can perform tasks like you would: browsing websites in a web browser, filling data into an Excel sheet, and so on.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1730020126511/c45f5aee-3e03-46e2-90b9-bde0b49a45b1.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-the-goal-of-this-article">The goal of this Article</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1730017630836/503998d8-d4c4-46f0-8ba2-b133efce0387.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-installation">Installation</h2>
<p>A detailed explanation of running the code is available in this repo.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://github.com/anthropics/anthropic-quickstarts/tree/main/computer-use-demo">https://github.com/anthropics/anthropic-quickstarts/tree/main/computer-use-demo</a></div>
<p> </p>
<p>We need a Linux distribution to run ‘Computer Use’ in an isolated environment (Docker).</p>
<p>To install the Windows Subsystem for Linux (WSL), run the command below in CMD.</p>
<p>Follow this <a target="_blank" href="https://learn.microsoft.com/en-us/windows/wsl/install">link</a> for more details.</p>
<pre><code class="lang-bash">wsl --install
</code></pre>
<p>You will need an Anthropic API key, which you can generate from the Anthropic Console.</p>
<p>Start Docker and run the snippet below in CMD/bash/PowerShell.</p>
<pre><code class="lang-bash">SET ANTHROPIC_API_KEY=your-anthropic-api-key
docker run ^
    -e ANTHROPIC_API_KEY=%ANTHROPIC_API_KEY% ^
    -v %USERPROFILE%\.anthropic:/home/computeruse/.anthropic ^
    -p 5900:5900 ^
    -p 8501:8501 ^
    -p 6080:6080 ^
    -p 8080:8080 ^
    -it ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest
</code></pre>
<p>After the installation completes, you should see the agent running at <code>localhost:8080</code>.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1730018304901/9b2a2daf-aba9-434f-855f-e49efa36618e.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-challenge-1-extracting-article-titles-from-my-newsletter">Challenge 1: Extracting Article Titles from My Newsletter</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1730019203346/35979cb8-4162-48df-a4ba-49ab9826a040.png" alt class="image--center mx-auto" /></p>
<p>The first task aimed to test the agent's ability to interact with a webpage and extract data. I provided the agent with a URL to my newsletter, asking it to capture all article titles and store them in an Excel sheet. This sounds simple, but it required the agent to navigate a few hurdles:</p>
<ol>
<li><p><strong>Navigating the Website</strong>: The agent successfully opened the URL in Firefox and accessed the page.</p>
</li>
<li><p><strong>Handling a Subscription Page</strong>: It encountered a subscription page and asked for input on how to proceed. After selecting the public archive, it moved forward.</p>
</li>
<li><p><strong>Using RSS Feeds</strong>: The agent smartly identified an RSS feed on the page and used it to extract article titles.</p>
</li>
</ol>
<p>The task was completed, and the agent saved around 20 articles in an Excel file. While it wasn’t a straightforward front-end automation use case, the agent demonstrated adaptability in handling unexpected scenarios.</p>
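<p>For the curious, the RSS step the agent discovered is easy to reproduce by hand. Below is a minimal Python sketch using only the standard library; the feed snippet and titles are invented for illustration (<code>html.unescape</code> is a no-op on plain text, so the snippet parses the same whether or not the markup is entity-escaped):</p>
<pre><code class="lang-python">import html
import xml.etree.ElementTree as ET

# Tiny stand-in for a real newsletter feed (titles are invented).
rss = html.unescape(
    "&lt;rss&gt;&lt;channel&gt;"
    "&lt;item&gt;&lt;title&gt;Post One&lt;/title&gt;&lt;/item&gt;"
    "&lt;item&gt;&lt;title&gt;Post Two&lt;/title&gt;&lt;/item&gt;"
    "&lt;/channel&gt;&lt;/rss&gt;"
)

root = ET.fromstring(rss)
titles = [item.findtext("title") for item in root.iter("item")]
print(titles)
</code></pre>
<p>From here, writing <code>titles</code> to a CSV file (which Excel opens directly) is one <code>csv.writer</code> call away.</p>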
<h2 id="heading-challenge-2-finding-the-latest-video-on-my-youtube-channel">Challenge 2: Finding the Latest Video on My YouTube Channel</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1730019328474/8e8360af-e017-4f2d-a653-96c971cdc1ef.png" alt class="image--center mx-auto" /></p>
<p>Next, I asked the agent to find the latest video from my YouTube channel. Here’s how it went:</p>
<ol>
<li><p><strong>Channel Search</strong>: The agent navigated to YouTube and searched for my channel, "Zahin Tab."</p>
</li>
<li><p><strong>Fetching the Latest Video</strong>: It successfully identified my latest video titled <em>"I Created a Blogging Agent in 5 Minutes Using OpenAI's Form"</em>.</p>
</li>
</ol>
<h2 id="heading-challenge-3-find-the-number-of-likes-in-the-video-from-task-2">Challenge 3: Find the number of likes in the video (from Task 2)</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1730019425939/84b424c5-bd9b-4c7f-9ef6-8a6016631523.png" alt class="image--center mx-auto" /></p>
<p><strong>Extracting Likes</strong>: I then asked the agent to find the number of likes on that video, and it efficiently navigated the page to retrieve this data.</p>
<p>The agent performed well here, understanding my commands even when they weren't perfectly specific. This task demonstrated its ability to interpret and execute instructions with some degree of context awareness.</p>
<h2 id="heading-challenge-4-finding-similar-videos-under-4-minutes">Challenge 4: Finding Similar Videos under 4 Minutes</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1730019505460/506bae91-6d6f-4ec1-b062-4b469c36c35c.png" alt class="image--center mx-auto" /></p>
<p>This task was more complex and required the agent to not only search for videos but also filter results by duration. Here’s the breakdown:</p>
<ol>
<li><p><strong>Understanding the Prompt</strong>: I asked the agent to find videos similar to my latest one, but with a duration of under 4 minutes.</p>
</li>
<li><p><strong>Using Filters</strong>: The agent used the search bar to enter the query, applied a duration filter, and selected the "Under 4 minutes" option.</p>
</li>
<li><p><strong>Evaluating Results</strong>: It was able to find a few similar videos, although some of the results didn’t exactly match the duration criteria.</p>
</li>
</ol>
<p>This task highlighted the potential for LLMs to automate more complex workflows by combining search, filters, and contextual understanding, making them highly useful in automation scenarios.</p>
<h2 id="heading-challenge-5-automating-form-filling">Challenge 5: Automating Form Filling</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1730019575708/48f296ec-26f4-4dd1-becd-37e6ce858b11.png" alt class="image--center mx-auto" /></p>
<p>One of the most common use cases in the RPA (Robotic Process Automation) industry is automating form filling. Here’s how the agent performed:</p>
<ol>
<li><p><strong>Accessing the Form</strong>: I provided a URL to a dummy form from RoboForm, a password manager.</p>
</li>
<li><p><strong>Filling in Data</strong>: The agent quickly identified the form fields and filled in the first name and last name.</p>
</li>
<li><p><strong>Handling Errors</strong>: Midway through, I encountered some technical issues and rate limits, which required restarting the application.</p>
</li>
</ol>
<p>This task demonstrated the potential of using LLMs for repetitive tasks like filling out forms, which could significantly save time in enterprise environments.</p>
<h2 id="heading-the-bigger-picture-andrej-karpathys-vision-of-llms-and-automation">The Bigger Picture: Andrej Karpathy's Vision of LLMs and Automation</h2>
<p>Reflecting on these experiments, I couldn’t help but think about Andrej Karpathy's vision for the future of LLMs—essentially envisioning an "LLMOS" (Large Language Model Operating System).</p>
<p>Imagine an LLM that can access OS-level functions, with unlimited memory and a singular focus on completing specific tasks. Here’s what that could look like in practice:</p>
<ul>
<li><p><strong>Intelligent Decision-Making</strong>: LLMs trained on specific datasets could make small decisions autonomously, handling routine tasks without human intervention.</p>
</li>
<li><p><strong>Enterprise Integration</strong>: In a business setting, this could mean automating tasks like customer support, form filling, or data entry, making processes more efficient.</p>
</li>
</ul>
<p>This vision excites me as someone who has been in the automation space for a long time, and it shows how the lines between front-end automation and intelligent decision-making are beginning to blur.</p>
<h2 id="heading-conclusion-the-future-of-front-end-automation-with-llms">Conclusion: The Future of Front-End Automation with LLMs</h2>
<p>These experiments were a glimpse into the future of AI-driven automation. From extracting data to interacting with websites and filling out forms, LLMs are proving to be versatile tools. While there are challenges like rate limits and technical hiccups, the potential for transforming enterprise automation is immense.</p>
<p>If you found this exploration interesting and want to learn more about the intersection of AI and automation, don’t forget to subscribe to my YouTube channel. I regularly share insights and tutorials on how to leverage AI tools effectively.</p>
<p>See you in the next post!</p>
]]></content:encoded></item><item><title><![CDATA[Building a Technical Content Writing Agent Using Swarm: A Step-by-Step Guide]]></title><description><![CDATA[In this blog post, we’ll delve into a new multi-agent framework called Swarm, developed by OpenAI, which has recently garnered significant attention within the AI community.
What we are building

The goal is to explore how to use Swarm by building a ...]]></description><link>https://zahere.com/building-a-technical-content-writing-agent-using-swarm-a-step-by-step-guide</link><guid isPermaLink="true">https://zahere.com/building-a-technical-content-writing-agent-using-swarm-a-step-by-step-guide</guid><category><![CDATA[#agent]]></category><category><![CDATA[aiagents]]></category><category><![CDATA[generative ai]]></category><category><![CDATA[openai]]></category><category><![CDATA[Swarm]]></category><dc:creator><![CDATA[Zahiruddin Tavargere]]></dc:creator><pubDate>Mon, 21 Oct 2024 02:45:04 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1729478638963/c0bf778b-6f73-4ab4-ac74-08098c55cb4b.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In this blog post, we’ll delve into a new multi-agent framework called <em>Swarm</em>, developed by OpenAI, which has recently garnered significant attention within the AI community.</p>
<h2 id="heading-what-we-are-building">What we are building</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1729477068550/527a09d8-3f2a-46aa-aa5a-f90f4330f574.png" alt class="image--center mx-auto" /></p>
<p>The goal is to explore how to use Swarm by building a simple agent that takes a user query, performs detailed research, and generates a structured blog post based on the findings.</p>
<p>This will help you understand the capabilities of Swarm, especially in orchestrating tasks among multiple agents.</p>
<h2 id="heading-what-is-swarm">What is Swarm?</h2>
<p>Swarm is a multi-agent orchestration framework introduced by OpenAI. Although currently labeled as experimental and educational, it has quickly become a favorite among developers, earning praise as one of the best frameworks for building and coordinating multi-agent systems.</p>
<h3 id="heading-key-concepts-of-the-swarm-framework">Key Concepts of the Swarm Framework</h3>
<p>Swarm's design is centered around two fundamental components:</p>
<p><strong>Agents</strong>: These are autonomous units that operate with a set of instructions and make decisions independently.</p>
<p>Each agent can specialize in a particular task, such as gathering data or writing content.</p>
<pre><code class="lang-python">agent = Agent(
   instructions=<span class="hljs-string">"You are a helpful agent."</span>
)
</code></pre>
<p><strong>Handoffs</strong>: This mechanism allows one agent to transfer control of a task or conversation to another agent.</p>
<p>It enables seamless collaboration between agents, ensuring smooth execution of complex workflows.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> swarm <span class="hljs-keyword">import</span> Swarm, Agent

client = Swarm()
sales_agent = Agent(name=<span class="hljs-string">"Sales Agent"</span>)

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">handoff_to_sales</span>():</span>
   <span class="hljs-keyword">return</span> sales_agent

agent = Agent(functions=[handoff_to_sales])

response = client.run(agent, [{<span class="hljs-string">"role"</span>:<span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>:<span class="hljs-string">"Transfer me to sales."</span>}])
print(response.agent.name)
</code></pre>
<p>By leveraging these components, Swarm helps developers create systems where tasks can be broken down into smaller, specialized sub-tasks, handled by different agents.</p>
<h2 id="heading-use-case-creating-a-technical-content-writing-agent">Use Case: Creating a Technical Content Writing Agent</h2>
<p>Code: <a target="_blank" href="https://colab.research.google.com/drive/1phDFUasrZxjChabWo_oWwWuNKwfjuJ_B?authuser=1#scrollTo=zyxmU_W9ZCxp">https://colab.research.google.com/drive/1phDFUasrZxjChabWo_oWwWuNKwfjuJ_B?authuser=1#scrollTo=zyxmU_W9ZCxp</a></p>
<p>To illustrate the potential of Swarm, let’s build a multi-agent system that generates a blog post based on user input. The system comprises three agents:</p>
<ol>
<li><p><strong>Interface Agent</strong>: Interacts with the user, refines the query if needed, and passes the query to the Researcher Agent.</p>
</li>
<li><p><strong>Researcher Agent</strong>: Conducts detailed research on the query and prepares a research report.</p>
</li>
<li><p><strong>Blogger Agent</strong>: Uses the research report to create a structured blog post.</p>
</li>
</ol>
<p>The workflow is simple: the user provides a theme or query, such as "Top 5 Technical Skills for 2025." The Interface Agent refines the query if necessary, then hands it over to the Researcher Agent, which gathers and analyzes relevant information. Finally, the Blogger Agent compiles the research into a well-written blog post.</p>
<h2 id="heading-how-the-swarm-framework-works">How the Swarm Framework Works</h2>
<h3 id="heading-illustration-overview">Illustration Overview</h3>
<p>Below is a high-level overview of the agent workflow:</p>
<ul>
<li><p><strong>User Input</strong>: The user provides a theme, e.g., "Why is R fast?".</p>
</li>
<li><p><strong>Interface Agent</strong>: Takes the user’s input, asks clarifying questions if needed, and then passes the refined query to the Researcher Agent.</p>
</li>
<li><p><strong>Researcher Agent</strong>: Gathers data from search engines, scrapes content from top results, and analyzes the information to produce a research report.</p>
</li>
<li><p><strong>Blogger Agent</strong>: Converts the research report into a blog post and returns it to the Interface Agent.</p>
</li>
<li><p><strong>Interface Agent</strong>: Delivers the completed blog post to the user.</p>
</li>
</ul>
<p>This setup demonstrates how Swarm’s agent-based architecture simplifies the process of delegating tasks and orchestrating their execution.</p>
<h3 id="heading-building-the-swarm-agents">Building the Swarm Agents</h3>
<p>Let’s dive into the code, which consists of the three agents and their respective functions.</p>
<h4 id="heading-agent-class-structure">Agent Class Structure</h4>
<p>Each agent is defined as a class with parameters like <code>name</code>, <code>model</code>, <code>instructions</code>, and <code>functions</code>. The functions let the agent interact with APIs, databases, and other agents. A key pattern is the handoff function, which returns another agent and thereby passes control to it:</p>
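<p>To make the structure concrete, here is a simplified stand-in for the <code>Agent</code> class (an illustrative mimic of the fields, not the real package's implementation):</p>
<pre><code class="lang-python">from dataclasses import dataclass, field
from typing import Callable, List, Optional, Union

# Simplified mimic of Swarm's Agent; the real class adds more behavior.
@dataclass
class Agent:
    name: str = "Agent"
    model: str = "gpt-4o"
    instructions: Union[str, Callable] = "You are a helpful agent."
    functions: List[Callable] = field(default_factory=list)
    tool_choice: Optional[str] = None

researcher = Agent(name="Researcher Agent", model="gpt-4o-mini")
print(researcher.name, researcher.model)
</code></pre>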
<div class="hn-table">
<table>
<thead>
<tr>
<td>Field</td><td>Type</td><td>Description</td><td>Default</td></tr>
</thead>
<tbody>
<tr>
<td><strong>name</strong></td><td><code>str</code></td><td>The name of the agent.</td><td><code>"Agent"</code></td></tr>
<tr>
<td><strong>model</strong></td><td><code>str</code></td><td>The model to be used by the agent.</td><td><code>"gpt-4o"</code></td></tr>
<tr>
<td><strong>instructions</strong></td><td><code>str</code> or <code>func() -&gt; str</code></td><td>Instructions for the agent, can be a string or a callable returning a string.</td><td><code>"You are a helpful agent."</code></td></tr>
<tr>
<td><strong>functions</strong></td><td><code>List</code></td><td>A list of functions that the agent can call.</td><td><code>[]</code></td></tr>
<tr>
<td><strong>tool_choice</strong></td><td><code>str</code></td><td>The tool choice for the agent, if any.</td><td><code>None</code></td></tr>
</tbody>
</table>
</div><h4 id="heading-researcher-agent">Researcher Agent</h4>
<p>The Researcher Agent uses a package called <code>GPT Researcher</code> to gather and analyze content from various search engines.</p>
<p>This agent performs a deep dive into the topic provided by the user.</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> nest_asyncio <span class="hljs-comment"># required for notebooks</span>
nest_asyncio.apply()

<span class="hljs-keyword">from</span> gpt_researcher <span class="hljs-keyword">import</span> GPTResearcher
<span class="hljs-keyword">import</span> asyncio

<span class="hljs-keyword">import</span> os
<span class="hljs-keyword">from</span> openai <span class="hljs-keyword">import</span> OpenAI

client = OpenAI(api_key=os.getenv(<span class="hljs-string">"OPENAI_API_KEY"</span>))

<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_report</span>(<span class="hljs-params">query: str</span>) -&gt; str:</span>
    report_type = <span class="hljs-string">"research_report"</span>
    researcher = GPTResearcher(query, report_type)
    research_result =  <span class="hljs-keyword">await</span> researcher.conduct_research()
    report =  <span class="hljs-keyword">await</span> researcher.write_report()

    <span class="hljs-comment"># Get additional information</span>
    research_context = researcher.get_research_context()
    research_costs = researcher.get_costs()
    research_images = researcher.get_research_images()
    research_sources = researcher.get_research_sources()

    <span class="hljs-keyword">return</span> {<span class="hljs-string">'report'</span>:report}

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">research_topic</span>(<span class="hljs-params">query: str</span>) -&gt; str:</span>
  <span class="hljs-string">"""Generate research report"""</span>
  <span class="hljs-keyword">return</span> asyncio.run(get_report(query))

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">handoff_to_researcher</span>():</span>
    <span class="hljs-string">"""Hand off the user query to the researcher agent."""</span>
    print(<span class="hljs-string">"Handing off to Researcher Agent"</span>)
    <span class="hljs-keyword">return</span> researcher_agent

researcher_agent = Agent(
    name=<span class="hljs-string">"Researcher Agent"</span>,
    model=<span class="hljs-string">"gpt-4o-mini"</span>,
    instructions=<span class="hljs-string">"You are a researcher agent specialized in researching. If you are satisfied with the research, handoff the report to blogger"</span>,
    functions=[research_topic, handoff_to_blogger],  <span class="hljs-comment"># handoff_to_blogger is defined in the Blogger Agent section below</span>
)
</code></pre>
<p>The <code>handoff_to_blogger</code> function transfers control to the Blogger Agent once the research is complete.</p>
<h4 id="heading-3-blogger-agent">3. Blogger Agent</h4>
<p>The Blogger Agent creates a blog post based on the research report. It uses the language model to generate a well-structured article:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">generate_completion</span>(<span class="hljs-params">role, task, content</span>):</span>
    <span class="hljs-string">"""Generate a completion using OpenAI."""</span>
    response = client.chat.completions.create(
        model=<span class="hljs-string">"gpt-4o-mini"</span>,
        messages=[
            {<span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">f"You are a <span class="hljs-subst">{role}</span>. <span class="hljs-subst">{task}</span>"</span>},
            {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: content}
        ]
    )
    <span class="hljs-keyword">return</span> response.choices[<span class="hljs-number">0</span>].message.content

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">handoff_to_blogger</span>():</span>
    <span class="hljs-string">"""Hand off the research report to the blogger agent."""</span>
    print(<span class="hljs-string">"Handing off to Blogger Agent"</span>)
    <span class="hljs-keyword">return</span> blogger_agent

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">generate_blog_content</span>(<span class="hljs-params">research_data</span>):</span>
    <span class="hljs-string">"""Generate technical blog content on research report using OpenAI."""</span>
    content = generate_completion(
        <span class="hljs-string">"Technical Content Creator"</span>,
        <span class="hljs-string">"Create compelling technical content for a blog based on the following research report."</span>,
        research_data
    )
    <span class="hljs-keyword">return</span> {<span class="hljs-string">"content"</span>: content}


blogger_agent = Agent(
    name=<span class="hljs-string">"Blogger Agent"</span>,
    model=<span class="hljs-string">"gpt-4o-mini"</span>,
    instructions=<span class="hljs-string">"You are a top technical blogger agent specialized in creating compelling technical content for blogs based on research report. Be concise."</span>,
    functions=[generate_blog_content],
)
</code></pre>
<h4 id="heading-4-interface-agent">4. Interface Agent</h4>
<p>The Interface Agent communicates with the user, refines the query, and then initiates the workflow by handing control over to the Researcher Agent:</p>
<pre><code class="lang-python">user_interface_agent = Agent(
    name=<span class="hljs-string">"User Interface Agent"</span>,
    model=<span class="hljs-string">"gpt-4o-mini"</span>,
    instructions=<span class="hljs-string">"You are a user interface agent that handles all interactions with the user. You need to always start with a theme or topic that the user wants to research. Ask clarification questions if needed. Be concise."</span>,
    functions=[handoff_to_researcher],
)
</code></pre>
<h3 id="heading-running-the-system">Running the System</h3>
<p>With the agents defined, you can now run the system:</p>
<ol>
<li><p>Install the necessary packages:</p>
<pre><code class="lang-python"> ! pip install git+https://github.com/openai/swarm.git openai gpt-researcher nest_asyncio
</code></pre>
</li>
<li><p>Initialize the agents and run the user input loop:</p>
<pre><code class="lang-python"> import os
 from google.colab import userdata
 from openai import OpenAI
 from swarm import Agent
 from swarm.repl import run_demo_loop

 os.environ['OPENAI_API_KEY'] = userdata.get("OPENAI_API_KEY")
 os.environ['TAVILY_API_KEY'] = userdata.get("TAVILY_KEY")
 FAST_LLM="openai:gpt-4o-mini"
 os.environ['FAST_LLM'] = FAST_LLM
 os.environ['SMART_LLM'] = FAST_LLM
</code></pre>
<pre><code class="lang-python"> <span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
     <span class="hljs-comment"># Run the demo loop with the user interface agent</span>
     run_demo_loop(user_interface_agent, stream=<span class="hljs-literal">True</span>)
</code></pre>
</li>
</ol>
<p>For the theme "Top 5 Technical Skills for 2025," the system will generate a structured blog post by conducting research and formatting the findings into readable content.</p>
<h2 id="heading-example-output-blog-post-generated-by-the-agent">Example Output: Blog Post Generated by the Agent</h2>
<p>Here's an example of what the generated blog post might look like:</p>
<hr />
<h2 id="heading-essential-skills-for-software-developers-staying-competitive-in-2025"><em>Essential Skills for Software Developers: Staying Competitive in 2025</em></h2>
<p><em>As we approach 2025, the pace of technological advancement presents both opportunities and challenges for software developers. The landscape is evolving with new innovations in artificial intelligence, cloud computing, and data analytics, among others. To remain competitive and relevant in the job market, developers must cultivate certain technical skills that are poised to dominate the industry. Here’s a detailed look at the top five technical skills every software developer should focus on in the coming years.</em></p>
<h3 id="heading-1-mastering-artificial-intelligence-and-machine-learning"><em>1. Mastering Artificial Intelligence and Machine Learning</em></h3>
<p><em>Artificial Intelligence (AI) and Machine Learning (ML) are no longer just buzzwords; they are core components of modern software applications. Industries from healthcare to finance are leveraging AI to automate processes, analyze data, and enhance customer experiences.</em></p>
<p><strong><em>What to Learn:</em></strong></p>
<ul>
<li><p><strong><em>Programming languages:</em></strong> <em>Python is the go-to language for AI and ML, alongside familiarity with frameworks like TensorFlow and PyTorch.</em></p>
</li>
<li><p><strong><em>Key areas:</em></strong> <em>Focus on natural language processing (NLP) and computer vision, as they will continue to grow in demand.</em></p>
</li>
<li><p><strong><em>Real-world application:</em></strong> <em>Engage in projects that allow you to implement algorithms and create systems that learn and adapt based on data.</em></p>
</li>
</ul>
<h3 id="heading-2-embracing-cloud-computing-and-serverless-architecture"><em>2. Embracing Cloud Computing and Serverless Architecture</em></h3>
<p><em>The migration to cloud computing is reshaping how businesses operate. Proficiency in cloud platforms such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform is becoming a mandatory skill for developers. Furthermore, an understanding of serverless architecture, which eliminates the need to manage servers, is a differentiator that can elevate your career.</em></p>
<p><strong><em>What to Learn:</em></strong></p>
<ul>
<li><p><strong><em>Cloud platforms:</em></strong> <em>Gain hands-on experience with major cloud services and their offerings.</em></p>
</li>
<li><p><strong><em>Serverless frameworks:</em></strong> <em>Explore platforms like AWS Lambda to understand how to deploy applications more swiftly.</em></p>
</li>
<li><p><strong><em>Collaboration:</em></strong> <em>Familiarize yourself with DevOps practices to enhance collaboration in software development and operations.</em></p>
</li>
</ul>
<h3 id="heading-3-building-cybersecurity-fundamentals"><em>3. Building Cybersecurity Fundamentals</em></h3>
<p><em>With the surge in cyber threats, understanding cybersecurity is critical for developers. Knowledge of secure coding practices, data encryption, and vulnerability assessment is essential for protecting applications and sensitive user information.</em></p>
<p><strong><em>What to Learn:</em></strong></p>
<ul>
<li><p><strong><em>Security principles:</em></strong> <em>Study secure coding standards and tools for threat modeling and incident response.</em></p>
</li>
<li><p><strong><em>Sandbox testing:</em></strong> <em>Experiment with penetration testing frameworks to gain practical knowledge.</em></p>
</li>
<li><p><strong><em>Developing secure code:</em></strong> <em>Integrate security considerations into your day-to-day coding practices and workflow.</em></p>
</li>
</ul>
<h3 id="heading-4-capitalizing-on-low-code-and-no-code-development"><em>4. Capitalizing on Low-Code and No-Code Development</em></h3>
<p><em>Low-code and no-code platforms are transforming application development by making it accessible to non-developers. These platforms enable rapid application development, which can markedly accelerate a company’s digital transformation efforts.</em></p>
<p><strong><em>What to Learn:</em></strong></p>
<ul>
<li><p><strong><em>Tools and platforms:</em></strong> <em>Explore tools like OutSystems, Mendix, or Bubble to understand how they work and their capabilities.</em></p>
</li>
<li><p><strong><em>Integration skills:</em></strong> <em>Learn how to integrate low-code solutions with traditional back-end systems.</em></p>
</li>
<li><p><strong><em>Process optimization:</em></strong> <em>Understand how these platforms can streamline workflows and provide rapid iterations.</em></p>
</li>
</ul>
<h3 id="heading-5-diving-into-data-science-and-analytics"><em>5. Diving Into Data Science and Analytics</em></h3>
<p><em>In our data-centric world, the ability to analyze and interpret data is invaluable. Understanding data science principles can empower developers to turn insights into actionable strategies.</em></p>
<p><strong><em>What to Learn:</em></strong></p>
<ul>
<li><p><strong><em>Data tools and languages:</em></strong> <em>Gain proficiency in SQL, R, and Python for data manipulation and visualization.</em></p>
</li>
<li><p><strong><em>Statistical analysis:</em></strong> <em>Understand key concepts of statistics and machine learning to derive meaningful insights from data.</em></p>
</li>
<li><p><strong><em>Predictive analysis:</em></strong> <em>Use machine learning algorithms to forecast trends and contribute to data-driven decision-making processes.</em></p>
</li>
</ul>
<h2 id="heading-conclusion"><em>Conclusion</em></h2>
<p><em>As technology continues to change at a rapid pace, developers must engage in continuous learning to stay up to date. The top five skills outlined above—AI and ML, cloud computing and serverless architecture, cybersecurity fundamentals, low-code/no-code development, and data science—are critical for any developer aiming to thrive in the industry by 2025.</em></p>
<p><em>Investing in these skills will not only enhance your employability but also equip you to contribute effectively to innovative projects that shape the technology landscape. Embrace this opportunity to upgrade your expertise and solidify your place in the future of software development.</em></p>
<h2 id="heading-references"><em>References</em></h2>
<ul>
<li><p><em>Hadalgi, N. (2024). The Most In-Demand Programming Skills for 2025: Staying Ahead in a Rapidly Evolving Tech Landscape. LinkedIn.</em></p>
</li>
<li><p><em>Teal HQ. (2024). Top Skills for Software Developers in 2024 (+Most Underrated Skills). Teal HQ.</em></p>
</li>
<li><p><em>Educative. (2023). Top Software Developer Skills To Learn in 2024. Educative.</em></p>
</li>
</ul>
<hr />
<h2 id="heading-conclusion-1">Conclusion</h2>
<p>The Swarm framework simplifies the creation of complex multi-agent systems, allowing developers to break down intricate tasks into manageable components.</p>
<p>By combining specialized agents for tasks like research and content creation, developers can automate time-consuming processes, such as generating technical blog posts from user queries.</p>
<p>With this guide, you now have a foundational understanding of Swarm’s capabilities. The next step is to explore more advanced use cases, such as orchestrating agents for more complex workflows.</p>
<p>Stay tuned for future posts where we dive deeper into Swarm and its applications in AI development!</p>
]]></content:encoded></item><item><title><![CDATA[Parent Document Retrieval: Balancing Detail and Context for Complex Queries]]></title><description><![CDATA[Why Use Parent Document Retrieval?
Traditional RAG methods can struggle with intricate questions due to their reliance on smaller text segments that may not encapsulate the broader themes or details of the original documents.
When building a RAG-base...]]></description><link>https://zahere.com/parent-document-retrieval-balancing-detail-and-context-for-complex-queries</link><guid isPermaLink="true">https://zahere.com/parent-document-retrieval-balancing-detail-and-context-for-complex-queries</guid><category><![CDATA[RAG ]]></category><category><![CDATA[Retrieval-Augmented Generation]]></category><category><![CDATA[generative ai]]></category><category><![CDATA[Artificial Intelligence]]></category><dc:creator><![CDATA[Zahiruddin Tavargere]]></dc:creator><pubDate>Sun, 06 Oct 2024 13:47:27 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1728222317327/6a8c7f84-3ed5-4548-9a05-c0deab279ce0.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-why-use-parent-document-retrieval"><strong>Why Use Parent Document Retrieval?</strong></h2>
<p>Traditional RAG methods can struggle with intricate questions due to their reliance on smaller text segments that may not encapsulate the broader themes or details of the original documents.</p>
<p>When building a RAG-based solution, answer the following questions before creating the indexing pipeline.</p>
<p><strong>What type of queries will the system handle?</strong></p>
<ul>
<li>Are the queries typically seeking specific details, or do they require a broader contextual understanding?</li>
</ul>
<p><strong>How important is precision versus context in the system’s responses?</strong></p>
<ul>
<li>Should the system prioritize precise answers to detailed questions (favoring smaller chunks), or should it provide more comprehensive responses even if they are less precise (favoring larger chunks)?</li>
</ul>
<p><strong>How much detail or noise is acceptable in the retrieved results?</strong></p>
<ul>
<li>Will smaller chunks provide too little context, or will larger chunks introduce too much irrelevant information?</li>
</ul>
<p><strong>Can the user’s query context vary significantly?</strong></p>
<ul>
<li>If the user queries tend to be more context-dependent, would using larger chunks improve understanding, or could the system risk missing key details?</li>
</ul>
<p><strong>What is the nature of the content being used for retrieval?</strong></p>
<ul>
<li>Does the content lend itself better to smaller chunks (e.g., factual, concise information), or does it require larger chunks to capture essential relationships and context (e.g., narrative or complex data)?</li>
</ul>
<h2 id="heading-what-is-parent-document-retrieval-pdr">What is Parent Document Retrieval (PDR)?</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1728221980279/024ab00d-5b50-4218-a9c5-275208906c95.png" alt class="image--center mx-auto" /></p>
<p>Parent Document Retrieval (PDR) is a technique used in Retrieval-Augmented Generation (RAG) systems that enhances retrieval by fetching the full parent documents of matched chunks to augment the LLM's generation.</p>
<p>This method addresses the limitations of standard RAG approaches that often rely on smaller text chunks, which may lack the necessary context for complex queries.</p>
<p>By retrieving complete parent documents, PDR allows for a more comprehensive understanding of the material, leading to richer and more informative responses, particularly for nuanced inquiries.</p>
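<p>To make the mechanics concrete, here is a toy sketch of the idea in plain Python. Naive substring matching stands in for embedding search, and every name below is illustrative rather than any library’s API:</p>

```python
# Toy illustration of Parent Document Retrieval (PDR):
# small child chunks are indexed for search, but retrieval
# returns the full parent document they came from.

def split(text: str, size: int) -> list[str]:
    """Split text into fixed-size child chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]

# Two "parent documents"
parents = {
    "doc1": "Lung cancer symptoms include a cough that does not go away and chest pain.",
    "doc2": "Smoking is the leading risk factor for lung cancer worldwide.",
}

# Index each child chunk together with its parent's id
child_index = []
for pid, text in parents.items():
    for chunk in split(text, 30):
        child_index.append((chunk, pid))

def retrieve_parent(query: str) -> str:
    """Score child chunks by naive substring overlap, return the whole parent."""
    words = query.lower().split()
    best_chunk, best_pid = max(
        child_index,
        key=lambda item: sum(w in item[0].lower() for w in words),
    )
    return parents[best_pid]

print(retrieve_parent("risk factor smoking"))
```

The search hits a small child chunk, but the caller receives the entire parent document, which is exactly the trade PDR makes: precise matching on small units, rich context in the result.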
<h2 id="heading-where-is-parent-document-retrieval-applied">Where is Parent Document Retrieval Applied?</h2>
<p>PDR is applicable in various domains where context-rich responses are essential. Some common applications include:</p>
<ul>
<li><p>Customer Support: Enhancing automated systems to provide detailed responses based on comprehensive product documentation.</p>
</li>
<li><p>Legal and Compliance: Assisting in retrieving relevant legal documents or regulations that require in-depth understanding.</p>
</li>
<li><p>Research and Academia: Facilitating access to full research papers or articles when specific sections are referenced.</p>
</li>
<li><p>Content Generation: Improving the quality of content produced by language models by providing them with extensive background information.</p>
</li>
</ul>
<h2 id="heading-when-should-parent-document-retrieval-be-used">When Should Parent Document Retrieval be Used?</h2>
<p>PDR should be employed particularly in scenarios where:</p>
<ul>
<li><p>The queries are complex or multifaceted, requiring detailed context.</p>
</li>
<li><p>The available data consists of lengthy documents that need to be segmented for better comprehension.</p>
</li>
<li><p>There is a need to ensure high accuracy and relevance in responses generated by language models.</p>
</li>
<li><p>Users seek comprehensive answers rather than brief snippets of information.</p>
</li>
</ul>
<p>Using PDR can significantly enhance the performance of RAG systems in these situations.</p>
<h2 id="heading-who-benefits-from-parent-document-retrieval">Who Benefits from Parent Document Retrieval?</h2>
<p>Various stakeholders can benefit from PDR, including:</p>
<ul>
<li><p>Developers and Data Scientists: Those working on RAG systems can leverage PDR to improve model performance and user satisfaction.</p>
</li>
<li><p>Businesses: Organizations seeking efficient customer support solutions can enhance their automated systems with PDR.</p>
</li>
<li><p>Researchers and Academics: Individuals needing thorough literature reviews or data analysis can utilize PDR for more effective information retrieval.</p>
</li>
<li><p>End Users: Anyone seeking detailed and contextually rich information will benefit from systems employing PDR.</p>
</li>
</ul>
<hr />
<iframe src="https://newsletter.adaptiveengineer.com/embed" width="480" height="320" style="border:1px solid #EEE;background:white;justify-content:center"></iframe>

<hr />
<h2 id="heading-how-to-implement-pdr-using-langchain">How to implement PDR using LangChain?</h2>
<p>You can go through the entire code in this notebook. Sharing some important snippets below.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain.schema <span class="hljs-keyword">import</span> Document
<span class="hljs-keyword">from</span> langchain.vectorstores <span class="hljs-keyword">import</span> Chroma
<span class="hljs-keyword">from</span> langchain.retrievers <span class="hljs-keyword">import</span> ParentDocumentRetriever
<span class="hljs-keyword">from</span> langchain.chains <span class="hljs-keyword">import</span> RetrievalQA
<span class="hljs-keyword">from</span> langchain_openai <span class="hljs-keyword">import</span> OpenAI
<span class="hljs-keyword">from</span> langchain.text_splitter <span class="hljs-keyword">import</span> RecursiveCharacterTextSplitter
<span class="hljs-keyword">from</span> langchain.storage <span class="hljs-keyword">import</span> InMemoryStore
<span class="hljs-keyword">from</span> langchain.document_loaders <span class="hljs-keyword">import</span> TextLoader,WebBaseLoader
<span class="hljs-keyword">from</span> langchain_openai <span class="hljs-keyword">import</span> OpenAIEmbeddings
<span class="hljs-keyword">from</span> langchain_openai <span class="hljs-keyword">import</span> ChatOpenAI
<span class="hljs-keyword">from</span> langchain_core.output_parsers <span class="hljs-keyword">import</span> StrOutputParser
<span class="hljs-keyword">from</span> langchain_core.prompts <span class="hljs-keyword">import</span> ChatPromptTemplate
<span class="hljs-keyword">import</span> os
<span class="hljs-keyword">import</span> tiktoken
<span class="hljs-keyword">from</span> google.colab <span class="hljs-keyword">import</span> userdata

  <span class="hljs-comment"># Loading a single website</span>
loaders = [
    WebBaseLoader(<span class="hljs-string">"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3864624/"</span>),
    WebBaseLoader(<span class="hljs-string">"https://www.mayoclinic.org/diseases-conditions/lung-cancer/symptoms-causes/syc-20374620"</span>)
]

docs = []
<span class="hljs-keyword">for</span> loader <span class="hljs-keyword">in</span> loaders:
    token_count = num_tokens_from_string(str(loader.load()),<span class="hljs-string">"cl100k_base"</span>)
    print(<span class="hljs-string">f"Tokens for <span class="hljs-subst">{loader.web_path}</span>: <span class="hljs-subst">{token_count}</span>"</span>)
    docs.extend(loader.load())
</code></pre>
<p>Output below</p>
<pre><code class="lang-plaintext">Tokens for https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3864624/: 43869
Tokens for https://www.mayoclinic.org/diseases-conditions/lung-cancer/symptoms-causes/syc-20374620: 4997
</code></pre>
<p>Create a parent splitter and a child splitter using the RecursiveCharacterTextSplitter module in LangChain.</p>
<p>Use Chroma DB as the vector store and LangChain’s InMemoryStore to store the large parent docs or chunks.</p>
<pre><code class="lang-python"><span class="hljs-comment"># This text splitter is used to create the parent documents</span>
parent_splitter = RecursiveCharacterTextSplitter(chunk_size=<span class="hljs-number">2000</span>)
<span class="hljs-comment"># This text splitter is used to create the child documents</span>
<span class="hljs-comment"># It should create documents smaller than the parent</span>
child_splitter = RecursiveCharacterTextSplitter(chunk_size=<span class="hljs-number">400</span>)
<span class="hljs-comment"># The vectorstore to use to index the child chunks</span>
vectorstore = Chroma(
    collection_name=<span class="hljs-string">"split_parents"</span>, embedding_function=OpenAIEmbeddings(model=<span class="hljs-string">"text-embedding-3-small"</span>)
)
<span class="hljs-comment"># The storage layer for the parent documents</span>
store = InMemoryStore()
</code></pre>
<p>Instantiate the ParentDocumentRetriever class using the above as parameters.  </p>
<pre><code class="lang-python">retriever = ParentDocumentRetriever(
    vectorstore=vectorstore,
    docstore=store,
    child_splitter=child_splitter,
    parent_splitter=parent_splitter,
)
</code></pre>
<p>Add the docs we created using the WebBaseLoader to the retriever object.</p>
<pre><code class="lang-python">retriever.add_documents(docs)
</code></pre>
<p>Test the vector store with a query.</p>
<pre><code class="lang-python">sub_docs = vectorstore.similarity_search(<span class="hljs-string">"What are the symptoms of Lung Cancer?"</span>)
print(sub_docs[<span class="hljs-number">0</span>].page_content)
</code></pre>
<p>The response.</p>
<pre><code class="lang-plaintext">SymptomsLung cancer typically doesn't cause symptoms early on. Symptoms of lung cancer usually happen when the disease is advanced.
Signs and symptoms of lung cancer that happen in and around the lungs may include:

A new cough that doesn't go away.
Chest pain.
Coughing up blood, even a small amount.
Hoarseness.
Shortness of breath.
Wheezing.
</code></pre>
<p>Let’s see the parent doc that is returned for the same query using the retriever object.</p>
<pre><code class="lang-python">retrieved_docs = retriever.invoke(<span class="hljs-string">"What are the symptoms of Lung Cancer?"</span>)
print(retrieved_docs[<span class="hljs-number">0</span>].page_content)
</code></pre>
<p>Response.</p>
<pre><code class="lang-plaintext">Patient Care &amp; Health Information
Diseases &amp; Conditions


Lung cancer

Request an Appointment
Symptoms &amp;causesDiagnosis &amp;treatmentDoctors &amp;departmentsCare atMayo Clinic






Print




Overview




        Lung cancer
        Enlarge image









Close



Lung cancer


Lung cancer
Lung cancer begins in the cells of the lungs.





Lung cancer is a kind of cancer that starts as a growth of cells in the lungs. The lungs are two spongy organs in the chest that control breathing.
Lung cancer is the leading cause of cancer deaths worldwide.
People who smoke have the greatest risk of lung cancer. The risk of lung cancer increases with the length of time and number of cigarettes smoked. Quitting smoking, even after smoking for many years, significantly lowers the chances of developing lung cancer. Lung cancer also can happen in people who have never smoked.Products &amp; ServicesA Book: Mayo Clinic Family Health BookNewsletter: Mayo Clinic Health Letter — Digital EditionShow more products from Mayo Clinic





SymptomsLung cancer typically doesn't cause symptoms early on. Symptoms of lung cancer usually happen when the disease is advanced.
Signs and symptoms of lung cancer that happen in and around the lungs may include:

A new cough that doesn't go away.
Chest pain.
Coughing up blood, even a small amount.
Hoarseness.
Shortness of breath.
Wheezing.

Signs and symptoms that happen when lung cancer spreads to other parts of the body may include:

Bone pain.
Headache.
Losing weight without trying.
Loss of appetite.
Swelling in the face or neck.

When to see a doctorMake an appointment with your doctor or other healthcare professional if you have any symptoms that worry you.
If you smoke and haven't been able to quit, make an appointment. Your healthcare professional can recommend strategies for quitting smoking. These may include counseling, medicines and nicotine replacement products.
Request an appointment
1925
</code></pre>
<p>Let’s create a chain using an LLM to generate responses using the query and the context from the retriever.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain_core.runnables <span class="hljs-keyword">import</span> RunnablePassthrough

<span class="hljs-comment"># Prompt template</span>
template = <span class="hljs-string">"""Answer the question based only on the following context:
{context}
Question: {question}
"""</span>
prompt = ChatPromptTemplate.from_template(template)

<span class="hljs-comment"># LLM</span>
model = ChatOpenAI(temperature=<span class="hljs-number">0</span>, model=<span class="hljs-string">"gpt-4o-mini"</span>)

<span class="hljs-comment"># RAG pipeline</span>
chain = (
    {<span class="hljs-string">"context"</span>: retriever, <span class="hljs-string">"question"</span>: RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
)

chain.invoke(<span class="hljs-string">"How are smoking and Lung Cancer related?"</span>)
</code></pre>
<p>Response</p>
<pre><code class="lang-plaintext">Smoking is the primary cause of most lung cancers.
It introduces cancer-causing substances, known as carcinogens, into the lungs, which damage the cells that line the lung tissue.
This damage can lead to changes in the cells' DNA, causing them to grow and multiply uncontrollably, ultimately resulting in cancer.
Additionally, smoking can also affect non-smokers through secondhand smoke exposure. While lung cancer can occur in individuals who have never smoked, the exact causes in these cases may not be clear. Overall, smoking significantly increases the risk of developing lung cancer.
</code></pre>
<p>As you can see, the response is neat and well grounded in the retrieved context.</p>
<p>Let me know in the comments how your experiments went.</p>
<hr />
<iframe src="https://newsletter.adaptiveengineer.com/embed" width="480" height="320" style="border:1px solid #EEE;background:white;justify-content:center"></iframe>]]></content:encoded></item><item><title><![CDATA[The Untold Story of the Engineer Who Saved React.JS]]></title><description><![CDATA[This recent social media post by Rahul Pandey, co-founder of Taro Community and ex-Meta Staff Engineer, really caught my attention.
In it, Rahul shares a story about an engineer from Khan Academy who, back in 2013, got hired by Facebook because of he...]]></description><link>https://zahere.com/the-untold-story-of-the-engineer-who-saved-reactjs</link><guid isPermaLink="true">https://zahere.com/the-untold-story-of-the-engineer-who-saved-reactjs</guid><category><![CDATA[Open Source]]></category><category><![CDATA[stories]]></category><dc:creator><![CDATA[Zahiruddin Tavargere]]></dc:creator><pubDate>Sat, 28 Sep 2024 19:08:06 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1727549767658/f367ec6d-183f-4460-bae3-38097b03a29f.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727548509894/e4e1539c-0d22-4bdd-9810-af5780ee545f.png" alt class="image--center mx-auto" /></p>
<p>This recent social media post by Rahul Pandey, co-founder of Taro Community and ex-Meta Staff Engineer, really caught my attention.</p>
<p>In it, Rahul shares a story about an engineer from Khan Academy who, back in 2013, got hired by Facebook because of her open-source contributions to React. Naturally, I was curious and dove into research—and what I found was mind-blowing.</p>
<p>This engineer didn’t just contribute to open source; she actually <strong>saved ReactJS</strong>. Let me tell you the story of <strong>Sophie Alpert</strong>.</p>
<p><strong>Sophie Alpert’s journey</strong> in tech is the stuff of movies—an inspiring narrative filled with passion, innovation, and leadership.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727548553175/0fc1a735-eb51-43e5-a1a5-6e5de5200f31.png" alt class="image--center mx-auto" /></p>
<p>When we think about engineers reaching the heights of people like Zuckerberg or Musk, the odds are staggering—maybe one in a billion.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727548445404/ddf0f3e5-b4d1-42a4-8b3a-713b4c59a327.png" alt class="image--center mx-auto" /></p>
<p>But Sophie's path shows us what every engineer truly aspires to: not just a successful career at top companies, but real, lasting contributions to the community.</p>
<hr />
<p>If you prefer watching a video:</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=VeT29QPYyYg">https://www.youtube.com/watch?v=VeT29QPYyYg</a></div>
<p> </p>
<hr />
<h3 id="heading-early-life-and-education"><strong>Early Life and Education</strong></h3>
<p>Sophie’s story begins in Colorado. Her parents, both familiar with coding, nurtured her love for technology from an early age. While other kids were reading comic books or novels, Sophie was reading computer manuals for fun. That’s how deep her passion ran.</p>
<p>By middle and high school, she was creating websites using Dreamweaver, coding personal projects, and even doing freelance work. Her skills were evolving fast.</p>
<p>She later went on to study computer science at <strong>Carnegie Mellon University</strong>, but the traditional classroom wasn’t cutting it for her. Sophie craved hands-on experience. So, after a summer internship at <strong>Khan Academy</strong>, where she developed interactive math tools, she made a bold decision—she dropped out of college to work there full-time.</p>
<h3 id="heading-career-milestones"><strong>Career Milestones</strong></h3>
<p>At <strong>Khan Academy</strong>, Sophie’s contributions were pivotal. She helped develop educational tools that transformed user experiences, reinforcing her belief that practical applications beat out theoretical knowledge any day.</p>
<p>But it was her <strong>transition to the React core team</strong> at Facebook that marked a defining moment in her career. Initially, Sophie was just an enthusiastic open-source contributor, but she didn’t stop there. Over time, her work was so impactful that she was invited to <strong>lead</strong> the React team at Facebook.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727549464687/cc2b1bfc-aafc-4340-970a-1ded7b7472b4.png" alt class="image--center mx-auto" /></p>
<p>Her focus? Making React more efficient and accessible for developers.</p>
<p>There’s a fantastic documentary by the YouTube channel <strong>HoneyPot</strong> that goes into this transition in detail, and it highlights just how instrumental Sophie’s contributions were.</p>
<p>Her work with React was so good, in fact, that Facebook couldn’t resist hiring her.</p>
<p>And as they say—the rest is history.</p>
<p>Today, React is <strong>the most popular web framework</strong> in the world, and its influence only continues to grow. Sure, open-source is a community effort, but if it weren’t for Sophie’s contributions and her fresh perspective back in 2013, React could’ve easily been just another failed tech experiment.</p>
<h3 id="heading-insights-and-philosophy"><strong>Insights and Philosophy</strong></h3>
<p>Throughout her career, Sophie has been vocal about the importance of side projects. To her, these unstructured projects are the ultimate test of creativity and problem-solving. They’re where real passion shines, and they often lead to breakthroughs that traditional work can’t always offer.</p>
<hr />
<p><strong>Sophie Alpert’s story</strong> is a testament to following your passion, breaking norms, and carving your own path in tech. She shows us that real-world experience can often lead to industry-changing contributions—no degree required.</p>
]]></content:encoded></item><item><title><![CDATA[Unlocking Deep Context: Why You Should Try Multi-Representation Indexing]]></title><description><![CDATA[Video
https://www.youtube.com/watch?v=v09v327xIdE
 
Picture this: AI, not just searching for keywords, but truly grasping the intent behind your query. An AI that doesn’t just retrieve what you asked for, but what you need to know. 
This is the third...]]></description><link>https://zahere.com/unlocking-deep-context-why-you-should-try-multi-representation-indexing</link><guid isPermaLink="true">https://zahere.com/unlocking-deep-context-why-you-should-try-multi-representation-indexing</guid><category><![CDATA[RAG ]]></category><category><![CDATA[advanced rag]]></category><category><![CDATA[indexing]]></category><category><![CDATA[generative ai]]></category><category><![CDATA[Retrieval-Augmented Generation]]></category><dc:creator><![CDATA[Zahiruddin Tavargere]]></dc:creator><pubDate>Mon, 23 Sep 2024 09:17:05 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1727081268825/9730e0f7-0c2e-4e63-907f-ab4009804c00.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-video">Video</h2>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=v09v327xIdE">https://www.youtube.com/watch?v=v09v327xIdE</a></div>
<p> </p>
<p>Picture this: AI, not just searching for keywords, but truly grasping the intent behind your query. An AI that doesn’t just retrieve what you <em>asked</em> for, but what you <em>need</em> to know. </p>
<p>This is the third article in the <a target="_blank" href="https://zahere.com/series/ai">Advanced RAG series</a>, where we are following the journey of TechnoHealth Solution and how they implement RAG for their several use cases.</p>
<p>Today, we’re unlocking one of the most efficient indexing techniques — multi-representation indexing.</p>
<h2 id="heading-what-is-multi-representation-indexing">What is Multi-Representation Indexing?</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727081442960/ad0d6a5a-d68d-4221-8815-e660d406d145.png" alt class="image--center mx-auto" /></p>
<p>Based on the paper <em>Dense X Retrieval</em> (also known as proposition-based retrieval), multi-representation indexing involves creating and storing multiple representations of each document within the retrieval system.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727081459617/6721d374-9918-4ed0-a168-1310e3db6631.png" alt class="image--center mx-auto" /></p>
<p>Representations here could mean traditional keyword analysis, deep semantic understanding or summary, and even visual elements like images or diagrams of the documents.</p>
<h2 id="heading-why-is-multi-representation-indexing-used">Why is Multi-Representation Indexing Used?</h2>
<p>Multi-representation indexing improves accuracy, adapts to different types of documents, and is flexible enough to handle complex information, like research papers, code, or even e-commerce product listings.</p>
<p>The primary motivations for employing multi-representation indexing include:</p>
<ul>
<li><p>Improved Retrieval Accuracy</p>
</li>
<li><p>Contextual Understanding</p>
</li>
<li><p>Flexibility for Document Types</p>
</li>
<li><p>Handling Complex Information</p>
</li>
</ul>
<h2 id="heading-when-is-multi-representation-indexing-used">When is Multi-Representation Indexing Used?</h2>
<p>Multi-representation indexing is particularly beneficial in scenarios where:</p>
<ul>
<li><p>Complex Queries Are Common: Users often ask nuanced questions requiring understanding beyond simple keyword matching.</p>
</li>
<li><p>Diverse Document Formats Are Involved: Environments where documents come in various formats and types, such as academic research or product catalogs.</p>
</li>
<li><p>Applications Require Enhanced Semantic Understanding: For applications benefiting from deeper semantic insights, such as conversational agents or advanced search engines.</p>
</li>
</ul>
<h2 id="heading-where-is-multi-representation-indexing-used"><strong>Where is Multi-Representation Indexing Used?</strong></h2>
<p>This advanced indexing technique is applied across various domains to improve the relevance and depth of information retrieval, mainly in:</p>
<ol>
<li><p><strong>Search Engines and Information Retrieval Systems</strong></p>
</li>
<li><p><strong>Conversational AI and Chatbots</strong></p>
</li>
</ol>
<h2 id="heading-who-benefits-from-multi-representation-indexing">Who Benefits from Multi-Representation Indexing?</h2>
<ul>
<li><strong>End-users</strong> interacting with AI systems, such as search engines, chatbots, or recommendation engines, benefit from more accurate and contextually relevant information. This extends to the professionals in various fields who rely on those systems.</li>
</ul>
<hr />
<iframe src="https://newsletter.adaptiveengineer.com/embed" width="480" height="320" style="border:1px solid #EEE;background:white;justify-content:center"></iframe>

<hr />
<h2 id="heading-how-multi-representation-indexing-works">How Multi-Representation Indexing Works</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727081696653/f4205364-7fbd-4696-9938-0f8ef244b4a6.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-implementation-steps">Implementation Steps</h3>
<ol>
<li><p><strong>Document Loading and Chunking</strong></p>
<ul>
<li>Documents are loaded from sources such as web pages and split into manageable chunks using techniques like recursive character splitting.</li>
</ul>
</li>
<li><p><strong>Summarization</strong></p>
<ul>
<li>A Language Model (LLM) chain is used to generate concise summaries of each chunk, preserving essential details crucial for later retrieval.</li>
</ul>
</li>
<li><p><strong>Setting Up the Multi-Vector Retriever</strong></p>
<ul>
<li>A <code>MultiVectorRetriever</code> instance is initialized, linking the vector store (optimized summaries) and document store (original chunks) using unique keys.</li>
</ul>
</li>
<li><p><strong>Adding Documents</strong></p>
<ul>
<li>Optimized summaries and original chunks are added to their respective stores, ensuring they are linked correctly for retrieval.</li>
</ul>
</li>
<li><p><strong>Query and Retrieval</strong></p>
<ul>
<li>Queries are processed using the <code>MultiVectorRetriever</code>, leveraging the optimized summaries for efficient document retrieval based on relevance.</li>
</ul>
</li>
</ol>
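<p>Before the LangChain snippets, the flow above can be sketched without any framework. In this toy stand-in, word overlap replaces embedding search, and plain dicts and lists replace the vector store and docstore; all names are illustrative:</p>

```python
import uuid

# Toy multi-representation index: search runs over short summaries,
# but retrieval returns the full original chunk linked via doc_id.

chunks = [
    "Full text of section one, covering immunotherapy trial design in detail...",
    "Full text of section two, with tables of survival rates by PD-L1 level...",
]
summaries = [
    "Summary: immunotherapy trial design",
    "Summary: survival rates by PD-L1 expression",
]

docstore = {}       # doc_id -> full chunk (stand-in for InMemoryStore)
summary_index = []  # (summary, doc_id) pairs (stand-in for the vector store)

for chunk, summary in zip(chunks, summaries):
    doc_id = str(uuid.uuid4())
    docstore[doc_id] = chunk
    summary_index.append((summary, doc_id))

def retrieve(query: str) -> str:
    """Score summaries by word overlap, then return the linked full chunk."""
    words = set(query.lower().split())
    _, best_id = max(
        summary_index,
        key=lambda pair: len(words & set(pair[0].lower().split())),
    )
    return docstore[best_id]

print(retrieve("survival rates"))
```

The unique key linking the two stores is the whole trick: the compact representation is what gets searched, and the key is what gets you back to the original content.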
<p>I’ll add two key snippets below.</p>
<ol>
<li>We are extracting representations like table data and text from this <a target="_blank" href="https://github.com/zahere-dev/multi-representation-indexing-advanced-rag/blob/main/Immunotherapy_in_Non-Small-Cell_Lung_Cancer.pdf">pdf</a> using the <a target="_blank" href="https://docs.unstructured.io/open-source/core-functionality/partitioning">Unstructured</a> library.</li>
</ol>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Any

<span class="hljs-keyword">from</span> pydantic <span class="hljs-keyword">import</span> BaseModel
<span class="hljs-keyword">from</span> unstructured.partition.pdf <span class="hljs-keyword">import</span> partition_pdf

<span class="hljs-comment"># Get elements</span>
raw_pdf_elements = partition_pdf(
    filename=path,
    <span class="hljs-comment"># Unstructured first finds embedded image blocks</span>
    extract_images_in_pdf=<span class="hljs-literal">False</span>,
    <span class="hljs-comment"># Use layout model (YOLOX) to get bounding boxes (for tables) and find titles</span>
    <span class="hljs-comment"># Titles are any sub-section of the document</span>
    infer_table_structure=<span class="hljs-literal">True</span>,
    <span class="hljs-comment"># Post processing to aggregate text once we have the title</span>
    chunking_strategy=<span class="hljs-string">"by_title"</span>,
    <span class="hljs-comment"># Chunking params to aggregate text blocks</span>
    <span class="hljs-comment"># Attempt to create a new chunk 3800 chars</span>
    <span class="hljs-comment"># Attempt to keep chunks &gt; 2000 chars</span>
    max_characters=<span class="hljs-number">4000</span>,
    new_after_n_chars=<span class="hljs-number">3800</span>,
    combine_text_under_n_chars=<span class="hljs-number">2000</span>,
    image_output_dir_path=path,
)
</code></pre>
<ol start="2">
<li>Using the MultiVectorRetriever module to leverage two data stores - one for summaries and the other for full docs.</li>
</ol>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> uuid

<span class="hljs-keyword">from</span> langchain.retrievers.multi_vector <span class="hljs-keyword">import</span> MultiVectorRetriever
<span class="hljs-keyword">from</span> langchain.storage <span class="hljs-keyword">import</span> InMemoryStore
<span class="hljs-keyword">from</span> langchain_chroma <span class="hljs-keyword">import</span> Chroma
<span class="hljs-keyword">from</span> langchain_core.documents <span class="hljs-keyword">import</span> Document
<span class="hljs-keyword">from</span> langchain_openai <span class="hljs-keyword">import</span> OpenAIEmbeddings

<span class="hljs-comment"># The vectorstore to use to index the child chunks</span>
vectorstore = Chroma(collection_name=<span class="hljs-string">"summaries"</span>, embedding_function=OpenAIEmbeddings(model=<span class="hljs-string">"text-embedding-3-small"</span>))

<span class="hljs-comment"># The storage layer for the parent documents</span>
store = InMemoryStore()
id_key = <span class="hljs-string">"doc_id"</span>

<span class="hljs-comment"># The retriever (empty to start)</span>
retriever = MultiVectorRetriever(
    vectorstore=vectorstore,
    docstore=store,
    id_key=id_key,
)

<span class="hljs-comment"># Add texts</span>
doc_ids = [str(uuid.uuid4()) <span class="hljs-keyword">for</span> _ <span class="hljs-keyword">in</span> texts]
summary_texts = [
    Document(page_content=s, metadata={id_key: doc_ids[i]})
    <span class="hljs-keyword">for</span> i, s <span class="hljs-keyword">in</span> enumerate(text_summaries)
]
retriever.vectorstore.add_documents(summary_texts)
retriever.docstore.mset(list(zip(doc_ids, texts)))

<span class="hljs-comment"># Add tables</span>
table_ids = [str(uuid.uuid4()) <span class="hljs-keyword">for</span> _ <span class="hljs-keyword">in</span> tables]
summary_tables = [
    Document(page_content=s, metadata={id_key: table_ids[i]})
    <span class="hljs-keyword">for</span> i, s <span class="hljs-keyword">in</span> enumerate(table_summaries)
]
retriever.vectorstore.add_documents(summary_tables)
retriever.docstore.mset(list(zip(table_ids, tables)))
</code></pre>
<p>We then leverage the retriever in the RAG chain. Invoking the chain with a user query produces the final output below.</p>
<pre><code class="lang-python">chain.invoke(<span class="hljs-string">"What is Dual Immunotherapy without Chemotherapy?"</span>)

Dual Immunotherapy without Chemotherapy refers to a treatment approach <span class="hljs-keyword">for</span> metastatic non-small cell lung cancer (NSCLC) that involves the use of two immune checkpoint inhibitors (ICIs) without the addition of chemotherapy. An example of this <span class="hljs-keyword">is</span> the combination of nivolumab <span class="hljs-keyword">and</span> ipilimumab, which has been FDA-approved <span class="hljs-keyword">for</span> first-line treatment <span class="hljs-keyword">in</span> patients <span class="hljs-keyword">with</span> PD-L1 expression ≥ <span class="hljs-number">1</span>% <span class="hljs-keyword">and</span> without EGFR/ALK alterations. 

In the phase III trial Checkmate <span class="hljs-number">227</span>, this combination demonstrated a positive overall survival (OS) benefit across all PD-L1 expression subgroups, <span class="hljs-keyword">with</span> a <span class="hljs-number">4</span>-year OS rate of <span class="hljs-number">29</span>% <span class="hljs-keyword">for</span> patients receiving nivolumab plus ipilimumab compared to <span class="hljs-number">18</span>% <span class="hljs-keyword">for</span> those receiving chemotherapy. This indicates that dual immunotherapy can provide significant long-term survival benefits <span class="hljs-keyword">for</span> patients <span class="hljs-keyword">with</span> advanced NSCLC.
</code></pre>
<p>For the full code, check out the GitHub repo below.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://github.com/zahere-dev/multi-representation-indexing-advanced-rag">https://github.com/zahere-dev/multi-representation-indexing-advanced-rag</a></div>
<p> </p>
<hr />
<iframe src="https://newsletter.adaptiveengineer.com/embed" width="480" height="320" style="border:1px solid #EEE;background:white;justify-content:center"></iframe>

<hr />
]]></content:encoded></item><item><title><![CDATA[Book Review Snippets: The Coming Wave - Chapter 1: Containment is Not Possible]]></title><description><![CDATA[In chapter 1, Mustafa's book explores the concept of "waves" in human history, focusing on the transformative impact of technology and the existential risks posed by artificial intelligence and synthetic biology.
Mustafa explains that human history i...]]></description><link>https://zahere.com/book-review-snippets-the-coming-wave-chapter-1-containment-is-not-possible</link><guid isPermaLink="true">https://zahere.com/book-review-snippets-the-coming-wave-chapter-1-containment-is-not-possible</guid><category><![CDATA[Artificial Intelligence]]></category><category><![CDATA[book summary]]></category><dc:creator><![CDATA[Zahiruddin Tavargere]]></dc:creator><pubDate>Tue, 17 Sep 2024 08:48:56 GMT</pubDate><content:encoded><![CDATA[<p>In chapter 1, Mustafa's book explores the concept of "waves" in human history, focusing on the transformative impact of technology and the existential risks posed by artificial intelligence and synthetic biology.</p>
<p>Mustafa explains that human history is shaped by world-changing events, which he calls waves. These waves can be both physical, like natural disasters, and metaphorical, such as the rise of empires, wars, and religion. Over the past two centuries, technology has emerged as a dominant wave, reshaping the world in unprecedented ways. Mustafa refers to this phenomenon as the rise of <em>Homo Technologicus</em>—humans as technological beings.</p>
<p>He argues that the next major wave will be driven by two core technologies: artificial intelligence and synthetic biology. These technologies, he warns, cannot be fully controlled, and their consequences could be dire.</p>
<p><strong>The Dilemma</strong>: Mustafa highlights the dilemma nations face in the coming wave. Ignoring AI and synthetic biology would mean missing out on their benefits, while pursuing them risks potentially catastrophic consequences. This tension defines the core dilemma: how to balance the need for progress with the existential risks these technologies present. Mustafa believes that the key to navigating this dilemma is containment, though it may not be entirely achievable.</p>
<p><strong>The Trap</strong>: Mustafa points out that many people fall into the trap of "pessimism aversion," an emotional response where they dismiss potential dangers, assuming that humanity will somehow manage. He argues that this optimistic outlook hinders serious consideration of the risks posed by these technologies.</p>
]]></content:encoded></item><item><title><![CDATA[Mastering Chunking for RAG: Semantic vs Recursive vs Fixed Size]]></title><description><![CDATA[Note: The read-time of this article was going beyond 4 minutes, so I am sharing the video instead.  
This is part of the Advanced RAG Series: Part 1
When working with Retrieval Augmented Generation (RAG) models, selecting the right chunking method ca...]]></description><link>https://zahere.com/mastering-chunking-for-rag-semantic-vs-recursive-vs-fixed-size</link><guid isPermaLink="true">https://zahere.com/mastering-chunking-for-rag-semantic-vs-recursive-vs-fixed-size</guid><category><![CDATA[RAG ]]></category><category><![CDATA[generative ai]]></category><category><![CDATA[advanced rag]]></category><category><![CDATA[AI]]></category><dc:creator><![CDATA[Zahiruddin Tavargere]]></dc:creator><pubDate>Mon, 16 Sep 2024 09:07:28 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1726477366647/b0223b8e-9eec-429a-9e23-2e4803112931.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Note: The read-time of this article was going beyond 4 minutes, so I am sharing the video instead.  </p>
<p>This is part of the Advanced RAG Series: <a target="_blank" href="https://zahere.com/rag-explained-how-this-company-implemented-retrieval-augmented-generation">Part 1</a></p>
<p>When working with Retrieval Augmented Generation (RAG) models, selecting the right chunking method can make a huge difference in performance.</p>
<p>In my latest YouTube video, I dive deep into the three main chunking approaches—<strong>Semantic</strong>, <strong>Recursive</strong>, and <strong>Fixed Size</strong>—and evaluate their performance based on four critical metrics: context precision, faithfulness, answer relevancy, and context recall.</p>
<p>The chunking method you choose can impact how accurate and relevant the AI-generated answers are. So, which method strikes the perfect balance between retaining enough context and providing highly relevant, faithful responses?</p>
<p>In the video, I break down:</p>
<ul>
<li><p>How <strong>Semantic Chunking</strong> performed in capturing context but struggled with relevancy.</p>
</li>
<li><p>Why <strong>Recursive Chunking</strong> emerged as a strong contender with high accuracy and relevancy.</p>
</li>
<li><p>The surprising strengths of <strong>Fixed Size Chunking</strong>, especially in context retention.</p>
</li>
</ul>
<p>If you're interested in fine-tuning your RAG models or curious about which chunking method works best, this video is packed with insights that will help you make the right choice. Check out the full breakdown in the embedded video below!</p>
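<p>For readers who prefer code to video, here is a minimal, self-contained sketch of two of the approaches compared above. This is not the exact implementation from the video; the function names, chunk sizes, and separator list are illustrative choices. Fixed-size chunking slices the text into equal overlapping windows, while recursive chunking tries coarse separators first (paragraphs, then sentences, then words) before falling back to a hard split.</p>

```python
# Illustrative sketch (hypothetical names and sizes, not the video's exact code).

def fixed_size_chunks(text, chunk_size=40, overlap=10):
    """Slice text into fixed-size windows that overlap by `overlap` characters."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

def recursive_chunks(text, chunk_size=40, seps=("\n\n", "\n", ". ", " ")):
    """Split on the coarsest separator present, recursing on oversized pieces."""
    if len(text) <= chunk_size:
        return [text]
    for sep in seps:
        if sep in text:
            chunks, buf = [], ""
            for part in text.split(sep):
                candidate = buf + sep + part if buf else part
                if len(candidate) <= chunk_size:
                    buf = candidate        # keep merging small pieces
                else:
                    if buf:
                        chunks.append(buf)
                    buf = part             # start a new chunk
            if buf:
                chunks.append(buf)
            out = []
            for c in chunks:               # recurse to handle still-oversized chunks
                out.extend(recursive_chunks(c, chunk_size, seps))
            return out
    return fixed_size_chunks(text, chunk_size, overlap=0)  # no separator: hard split

doc = ("RAG retrieves chunks. Chunk quality matters.\n\n"
       "Recursive splitting respects document structure.")
print(fixed_size_chunks(doc)[0])
print(recursive_chunks(doc))
```

<p>In practice you would tune the chunk size and overlap against the retrieval metrics discussed in the video (context precision, faithfulness, answer relevancy, context recall) rather than picking them by hand.</p>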
<hr />
<p><em>Watch the full analysis and find out which chunking method is best for your use case:</em>  </p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=jEzh4IuTWtc">https://www.youtube.com/watch?v=jEzh4IuTWtc</a></div>
<p> </p>
<hr />
<iframe src="https://newsletter.adaptiveengineer.com/embed" width="480" height="320" style="border:1px solid #EEE;background:white;justify-content:center"></iframe>]]></content:encoded></item><item><title><![CDATA[RAG Explained: How 'This' Company Implemented Retrieval-Augmented Generation]]></title><description><![CDATA[Video
https://www.youtube.com/watch?v=fpbyPm5MZSM
 
The Context
TechnoHealth Solutions, a fictitious, mid-sized tech company that builds software for hospitals,  has identified a critical issue facing healthcare professionals -  information overload ...]]></description><link>https://zahere.com/rag-explained-how-this-company-implemented-retrieval-augmented-generation</link><guid isPermaLink="true">https://zahere.com/rag-explained-how-this-company-implemented-retrieval-augmented-generation</guid><category><![CDATA[generative ai]]></category><category><![CDATA[RAG ]]></category><category><![CDATA[Retrieval-Augmented Generation]]></category><category><![CDATA[genai]]></category><dc:creator><![CDATA[Zahiruddin Tavargere]]></dc:creator><pubDate>Mon, 09 Sep 2024 11:43:02 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1725882089058/84295ff7-5be2-4e72-b59a-4e19f083d215.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-video">Video</h2>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=fpbyPm5MZSM">https://www.youtube.com/watch?v=fpbyPm5MZSM</a></div>
<p> </p>
<h2 id="heading-the-context">The Context</h2>
<p>TechnoHealth Solutions, a fictitious mid-sized tech company that builds software for hospitals, has identified a critical issue facing healthcare professionals: information overload and outdated knowledge.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725881392861/4f16602f-1d1b-4f0c-8f8e-a86560f411fc.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-the-problem"><strong>The Problem:</strong></h3>
<p>Picture this:</p>
<ul>
<li><p>Doctors spend hours buried in research instead of treating patients</p>
</li>
<li><p>Nurses struggle to keep up with thousands of new medical studies published every week</p>
</li>
<li><p>Patients receive inconsistent care because their providers can't access the latest information</p>
</li>
<li><p>The constant fear of medical errors due to outdated knowledge</p>
</li>
</ul>
<p>It's a healthcare nightmare, and it's happening right now in hospitals around the world.</p>
<h3 id="heading-the-failed-solutions"><strong>The Failed Solutions:</strong></h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725881431371/162c3aa4-a709-4f63-904e-5089edacfff3.png" alt class="image--center mx-auto" /></p>
<p>TechnoHealth Solutions first thought, "Let's just build a better search engine!" But they quickly realized that would only add to the problem. Doctors don't have time to sift through endless documents.</p>
<p>Then they considered an AI chatbot. But two major roadblocks appeared:</p>
<ol>
<li><p>The risk of AI "hallucinations" – making up false medical information</p>
</li>
<li><p>The sheer volume of medical data exceeded what current AI systems could handle</p>
</li>
</ol>
<h2 id="heading-the-breakthrough"><strong>The Breakthrough:</strong></h2>
<p>Finally, TechnoHealth Solutions developed MedAssist AI, a system that leverages Retrieval Augmented Generation (RAG) technology to address these challenges.</p>
<p>But what exactly is RAG, and how does it solve this medical knowledge crisis?</p>
<h2 id="heading-what-is-rag"><strong>What is RAG?</strong></h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725881514362/6689cb02-85c0-46a5-9ae0-feb3477d8387.png" alt class="image--center mx-auto" /></p>
<p>Here’s how it works.</p>
<p><strong>Retrieval</strong>: When a user asks a question or provides a prompt, RAG first retrieves relevant passages from a vast knowledge base. This knowledge base could be the internet, a company’s internal documents, or any other source of text data.</p>
<p><strong>Augmentation</strong>: The retrieved passages are then used to “augment” the LLM’s knowledge. This can involve various techniques, such as summarizing or encoding the key information.</p>
<p><strong>Generation</strong>: Finally, the LLM leverages its understanding of language along with the augmented information to generate a response. This response can be an answer to a question, a creative text format based on a prompt, or any other form of text generation.</p>
<h2 id="heading-why-rag"><strong>Why RAG?</strong></h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725881561025/ea7738a4-5dd3-4c0e-a127-22b8aee5102b.png" alt class="image--center mx-auto" /></p>
<p><strong>Boosted Factual Accuracy</strong>: RAG strengthens LLMs by connecting them to external sources of information, like databases or live feeds. This ensures that their responses are based on real-world facts rather than relying solely on what they were trained on.  </p>
<p><strong>Expert Knowledge in Specific Domains</strong>: General-purpose LLMs are like a jack of all trades—they know a bit about everything but aren’t experts in any one field. With RAG, you can integrate specific knowledge bases, allowing the AI to answer highly specialized questions.  </p>
<p><strong>Fewer Mistakes (Reduced Hallucination)</strong>: LLMs sometimes make up information that sounds convincing but isn’t true. RAG reduces this risk by providing the AI with reliable sources to back up its claims, leading to more trustworthy responses.</p>
<p><strong>Adaptability to New Information</strong>: The world is constantly evolving, and LLMs trained on older data can quickly become outdated. RAG solves this by giving AI access to up-to-date sources, so it can always provide current information.</p>
<p><strong>Customizable and Scalable</strong>: RAG isn’t a one-size-fits-all solution. It can be adjusted to fit your needs, whether you're working with limited resources or require more power for complex tasks.</p>
<h2 id="heading-technohealth-solutions-rag-implementation"><strong>TechnoHealth Solutions’ RAG Implementation</strong></h2>
<p>Here’s how TechnoHealth Solutions built their RAG solution to make their AI smarter, faster, and more reliable.</p>
<h3 id="heading-step-1-building-an-indexing-pipeline"><strong>Step 1: Building an Indexing Pipeline</strong></h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725881614479/8a9f3598-2011-436b-80af-5e500d2f545f.png" alt class="image--center mx-auto" /></p>
<p>The first step was to <strong>organize and process their data</strong>. TechnoHealth consolidated all their data sources, from medical reports to research papers. They then:</p>
<ul>
<li><p><strong>Processed the documents</strong> by breaking them down into smaller, manageable chunks.</p>
</li>
<li><p>Passed these chunks through an <strong>embedding model</strong>, which turned them into vector representations, or "embeddings." These are mathematical versions of the text that make it easy for the system to compare and find similarities.</p>
</li>
<li><p>Finally, they stored all these embeddings in a <strong>vector database</strong>. Think of this as a specialized storage space for these mathematical text chunks.</p>
</li>
</ul>
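<p>The indexing steps above can be sketched in a few lines of Python. This is a toy illustration, not TechnoHealth’s actual pipeline: a term-frequency vector stands in for a real embedding model, and a plain Python list stands in for a real vector database.</p>

```python
# Toy indexing pipeline: chunk -> "embed" -> store.
# All names and data here are illustrative assumptions.
from collections import Counter

def embed(text, vocab):
    """Toy embedding: a term-frequency vector over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

def build_index(documents, chunk_size=8):
    vocab = sorted({w for d in documents for w in d.lower().split()})
    vector_db = []  # stands in for a real vector database
    for doc in documents:
        words = doc.split()
        # Step 1: break each document into smaller chunks
        for i in range(0, len(words), chunk_size):
            chunk = " ".join(words[i:i + chunk_size])
            # Step 2: embed the chunk; Step 3: store (vector, chunk) pairs
            vector_db.append((embed(chunk, vocab), chunk))
    return vocab, vector_db

vocab, vector_db = build_index([
    "New trial results show improved outcomes for early intervention.",
    "Updated dosing guidance was published for pediatric patients.",
])
print(len(vector_db))  # → 3 (two chunks from the first document, one from the second)
```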
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725881688417/60e71032-57a0-4a43-b880-0081a4a6403a.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-step-2-building-the-retrieval-system"><strong>Step 2: Building the Retrieval System</strong></h3>
<p>When a user asks a question, TechnoHealth’s retrieval system goes to work:</p>
<ul>
<li><p>It first <strong>converts the user’s question</strong> into an embedding (just like they did with the documents).</p>
</li>
<li><p>Then, it <strong>compares</strong> this question embedding to all the text chunks stored in the vector database to find the <strong>most similar chunks</strong>.</p>
</li>
</ul>
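<p>A minimal sketch of this retrieval step, again with toy term-frequency vectors standing in for real embeddings (the vocabulary, data, and function names are illustrative): the question is embedded the same way as the documents, and stored chunks are ranked by cosine similarity.</p>

```python
# Toy retrieval step: embed the question, rank chunks by cosine similarity.
import math
from collections import Counter

VOCAB = ["dosing", "guidance", "outcomes", "pediatric", "trial"]

def embed(text):
    counts = Counter(text.lower().split())
    return [counts[w] for w in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# An already-built index of (vector, chunk) pairs.
vector_db = [(embed(c), c) for c in [
    "trial outcomes improved with early intervention",
    "pediatric dosing guidance was updated",
]]

def retrieve(question, k=1):
    q = embed(question)  # embed the question just like the documents
    ranked = sorted(vector_db, key=lambda vc: cosine(q, vc[0]), reverse=True)
    return [chunk for _, chunk in ranked[:k]]

print(retrieve("latest pediatric dosing guidance"))
```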
<h3 id="heading-step-3-augmenting-the-llm"><strong>Step 3: Augmenting the LLM</strong></h3>
<p>Now comes the magic of <strong>augmenting the LLM’s knowledge</strong>:</p>
<ul>
<li><p>The system crafts a <strong>prompt</strong> that includes:</p>
<ul>
<li><p><strong>Instructions</strong> for the LLM.</p>
</li>
<li><p>The <strong>user’s question</strong>.</p>
</li>
<li><p>The <strong>most relevant documents</strong> retrieved from the database (the context).</p>
</li>
</ul>
</li>
<li><p>This carefully crafted prompt tells the LLM how to use the data to generate a better response.</p>
</li>
</ul>
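<p>Assembling such a prompt is straightforward string templating. The wording below is an illustrative template, not TechnoHealth’s actual prompt:</p>

```python
# Toy augmentation step: combine instructions, retrieved context, and the
# user's question into one prompt. Template wording is an assumption.

def build_prompt(question, retrieved_chunks):
    context = "\n".join(f"- {c}" for c in retrieved_chunks)
    return (
        "You are a medical assistant. Answer using ONLY the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

prompt = build_prompt(
    "What is the updated pediatric dosing guidance?",
    ["pediatric dosing guidance was updated"],
)
print(prompt)
```

<p>Grounding the instructions in the retrieved context this way is what reduces hallucination: the model is told to answer from the supplied passages rather than from its general training data.</p>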
<h3 id="heading-step-4-generating-the-response"><strong>Step 4: Generating the Response</strong></h3>
<p>Finally, the crafted prompt is sent to the LLM:</p>
<ul>
<li><p>The LLM uses the <strong>instructions</strong> and the <strong>context</strong> to generate a complete, informed response to the user’s query.</p>
</li>
<li><p>The response is sent back to the user, more accurate and grounded in real-world data than if the LLM had answered from its general knowledge alone.</p>
</li>
</ul>
<h2 id="heading-conclusion"><strong>Conclusion</strong>:</h2>
<p>By combining indexing, retrieval, augmentation, and generation, TechnoHealth Solutions built a RAG system that ensures every answer their AI provides is based on real, up-to-date knowledge.</p>
<p>In the next article, we will talk about pre-retrieval optimizations.</p>
]]></content:encoded></item><item><title><![CDATA[A Glimpse Into the Future of Software Engineering]]></title><description><![CDATA[Video
https://www.youtube.com/watch?v=992AYyvMkDo
 

These two posts—just a few lines of text by two influential voices in the tech space—have ignited a storm of conversations.
My social media feeds are buzzing—filled with commentary, debates, and th...]]></description><link>https://zahere.com/a-glimpse-into-the-future-of-software-engineering</link><guid isPermaLink="true">https://zahere.com/a-glimpse-into-the-future-of-software-engineering</guid><category><![CDATA[AI]]></category><category><![CDATA[aiagents]]></category><category><![CDATA[coding]]></category><category><![CDATA[AI Coding Assistant]]></category><category><![CDATA[AI Code Generator]]></category><dc:creator><![CDATA[Zahiruddin Tavargere]]></dc:creator><pubDate>Mon, 02 Sep 2024 08:27:36 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1725265473827/88fe1aff-4250-49a2-b4d5-6ecdf0c5b3a1.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3 id="heading-video">Video</h3>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=992AYyvMkDo">https://www.youtube.com/watch?v=992AYyvMkDo</a></div>
<p> </p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725264559502/aec1cf4f-58ad-45cf-b3be-a6ba7403202c.png" alt class="image--center mx-auto" /></p>
<p>These two posts—just a few lines of text by two influential voices in the tech space—have ignited a storm of conversations.</p>
<p>My social media feeds are buzzing—filled with commentary, debates, and threads that seem to extend endlessly. And the chatter isn’t dying down; in fact, it’s only growing louder.</p>
<p>What’s clear is this: they’re an indication, a signal—an alarm—that something has fundamentally shifted in the software industry.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725264630174/20a57ddb-34a8-49ca-8c07-471d1e54b138.png" alt class="image--center mx-auto" /></p>
<p>In a recent LinkedIn post, Amazon CEO Andy Jassy dropped a bombshell: Amazon Q, their generative AI assistant, has dramatically redefined what’s possible in software development.</p>
<p>Take this: what used to take 50 developer days to upgrade applications to Java 17 now takes mere hours. The equivalent of 4,500 developer-years of work saved—just like that.</p>
<p>Amazon upgraded more than half of its production Java systems in less than six months. A process that usually demands extensive time and resources was completed in a fraction of the time, at a fraction of the cost.</p>
<p>And the most striking part? 79% of the auto-generated code reviews were shipped by developers without any additional tweaks.</p>
<p>The AI didn’t just assist—it delivered.</p>
<p>And it’s not just the corporate giants making waves. Influential voices in AI research are also weighing in on the transformative power of AI in coding.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725264645500/4e3567da-14d2-4bbc-b55f-d16c3ce74bd5.png" alt class="image--center mx-auto" /></p>
<p>In a recent tweet, AI researcher Andrej Karpathy shared his experience with AI-assisted programming. He found these tools so effective that they’ve completely reshaped his coding workflow.</p>
<p>Karpathy describes a new way of coding—what he calls "half-coding." He writes prompts in plain English, reviews the AI-generated code diffs, and lets the AI handle the heavy lifting, completing substantial portions of code in record time.</p>
<p>Karpathy says - he can’t imagine going back to the way things were.</p>
<p>And neither can I. I’ve been using these code generators for over a year now, and my productivity has noticeably increased.</p>
<h2 id="heading-coding-is-undergoing-a-seismic-shift">Coding is Undergoing a Seismic Shift</h2>
<p>So let’s start with a fact: coding is undergoing, or is about to undergo, a seismic shift.</p>
<p>While I was researching about this topic, I stumbled across <a target="_blank" href="https://x.com/russelljkaplan/status/1820460525802926268">Russell Kaplan's thread</a> on <a target="_blank" href="http://x.com">x.com</a> which resonated well with me.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725265882930/150cab6d-e347-4047-85ff-ec4f1b9a4d3e.png" alt class="image--center mx-auto" /></p>
<p>It’s a fascinating thread - So I will break down some of the tweets that were thought-provoking.</p>
<p>Here’s why this matters. Research labs around the world are pouring resources into making AI models better at coding and reasoning. This isn’t just incremental progress—it’s a massive leap forward. These models are being trained to write code, reason through problems, and improve themselves in ways we haven’t seen before.</p>
<p>Why coding? What makes it so special?</p>
<p>The answer lies in the unique advantage coding offers: it’s a domain where AI can learn through “self-play.”</p>
<p>Unlike other fields where data is limited by human expertise, code can be tested, tweaked, and optimized automatically. It’s a playground for AI, where models can write code, run it, and check for consistency—all without human intervention. This kind of automatic supervision is not just beneficial; it’s revolutionary.</p>
<p>But what is self-play?</p>
<hr />
<iframe src="https://newsletter.adaptiveengineer.com/embed" width="480" height="320" style="border:1px solid #EEE;background:white;justify-content:center"></iframe>

<hr />
<p>To understand self-play, let's learn about the model that pioneered the concept.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725264689573/84f2aaab-10eb-410c-9e42-c6bccd5f1526.png" alt class="image--center mx-auto" /></p>
<p>AlphaGo, developed by DeepMind, is a groundbreaking AI that made history by defeating human world champions in the ancient game of Go.</p>
<p>One of its key innovations was the use of self-play, where AlphaGo played countless games against versions of itself.</p>
<p>This technique allowed the AI to continuously improve, discovering new strategies and refining its decision-making process without human input.</p>
<p>Fascinating, isn’t it?</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725264713616/d66cdacd-6c39-42e7-bf73-eeae25b1735d.png" alt class="image--center mx-auto" /></p>
<p>Using this as inspiration, researchers from Microsoft and MIT, in the paper “<strong>Language Models Can Teach Themselves to Program Better</strong>,” demonstrated an answer to the question:</p>
<p>“Can an LM design its own programming problems to improve its problem-solving ability?”</p>
<p>Rather than using English problem descriptions which are ambiguous and hard to verify, they generated puzzles.</p>
<h2 id="heading-self-play-using-programming-puzzles"><strong>Self-play using programming puzzles</strong></h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725264732367/99401093-4c0f-478f-a5eb-aef5b1cf3694.png" alt class="image--center mx-auto" /></p>
<p>Here’s how they built the pipeline.</p>
<p><strong>Puzzle Generation</strong>: The language model generates new puzzles by sampling from the training set, combining them, and creating additional puzzles within its context window. These puzzles are then filtered for syntactic validity and to exclude trivial solutions.</p>
<p><strong>Solution Generation</strong>: The model attempts to solve the valid puzzles using a few-shot learning strategy, with a predetermined number of attempts per puzzle.</p>
<p><strong>Solution Verification</strong>: Generated solutions are verified using a Python interpreter. From these, up to <em>m</em> correct and concise solutions are selected for each puzzle.</p>
<p><strong>Fine-Tuning</strong>: The model is fine-tuned on the selected puzzle-solution pairs.</p>
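<p>The verification step is worth a closer look, because it is what makes self-play possible without human labels. In the paper’s setup, a puzzle is a Python predicate, and a candidate answer counts as a solution only if running the predicate on it returns <code>True</code>. The specific puzzle and candidates below are made up for illustration:</p>

```python
# Illustrative sketch of interpreter-based solution verification.
# The puzzle and candidate list are invented for demonstration.

def f(s: str) -> bool:
    """Puzzle: find a string of length 5 whose characters are all 'a'."""
    return len(s) == 5 and set(s) == {"a"}

candidates = ["aaaa", "aaaaa", "bbbbb"]  # e.g. sampled from a language model

def verify(puzzle, candidates, m=1):
    """Keep up to m candidates that the interpreter confirms as correct."""
    correct = []
    for c in candidates:
        try:
            if puzzle(c):  # run the candidate through the puzzle predicate
                correct.append(c)
        except Exception:
            pass  # crashing candidates are simply rejected
        if len(correct) >= m:
            break
    return correct

print(verify(f, candidates))  # → ['aaaaa']
```

<p>Because correctness is checked mechanically, every verified pair becomes clean training data, which is exactly the automatic supervision described above.</p>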
<p>The result?</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725264748104/de2c9260-508e-431f-8725-ac5e6b7647ba.png" alt class="image--center mx-auto" /></p>
<p>The diagram illustrates how the iterative process of generating, verifying, and fine-tuning on synthetic data significantly improves the performance of the language model in solving puzzles.</p>
<p><strong>Initial Evaluation:</strong> Without any fine-tuning, GPT-Neo solves 7.5% of the held-out test puzzles.</p>
<p><strong>Evaluation After Fine-Tuning on Unverified Data:</strong> After fine-tuning on the unverified synthetic data, the model's performance improves, solving 21.5% of the held-out test puzzles.</p>
<p><strong>Evaluation After Fine-Tuning on Verified Data:</strong> Fine-tuning on the verified synthetic data further enhances the model's performance, allowing it to solve 38.2% of the held-out test puzzles.</p>
<p>Now, let’s look ahead.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725264828902/58797de2-9f43-43f9-b3a9-21877b6d2df5.png" alt class="image--center mx-auto" /></p>
<p>In just a few years, software engineering will be almost unrecognizable. Imagine having an army of coding agents at your disposal—each one capable of handling tasks from start to finish. This isn’t science fiction; it’s where we’re headed.</p>
<p>Engineers will transition from writing lines of code to managing these agents, overseeing the architecture of systems, and making high-level decisions.</p>
<p>This shift will redefine the role of the software engineer. In this new world, coding becomes less about the syntax and more about the strategy—understanding what needs to be built and why.</p>
<p>It’s like moving from being a craftsman to a project manager, where your focus is on the bigger picture.</p>
<p>To extend this idea, I return to Andrej Karpathy’s vision of AI resembling an operating system—essentially, a powerful agent. Andrej describes this as an entity with more knowledge than any single human on all subjects.</p>
<p>Now, imagine this as a specialized agent focused on a specific domain or business use case.</p>
<p>It’s possible, and we may not be very far from it.</p>
<p>Russell calls this “Software Abundance.”</p>
<p>As coding becomes 10 times more accessible, we’ll see a proliferation of what can be called “single-use software”—apps and websites designed for specific, often one-off purposes.</p>
<p>This abundance won’t just democratize software creation; it will fundamentally change how we think about software itself. Imagine tools being created for specific events, or small businesses commissioning custom apps for limited-time campaigns.</p>
<p>What was once impractical will become routine.</p>
<p>As this new reality unfolds, the role of the software engineer will continue to evolve. Just as engineers transitioned from assembly language to high-level languages like Python, they will adapt to a world where English, rather than code, becomes the primary tool of communication.</p>
<p>This change will require a shift in mindset—from focusing on how to write code, to understanding what needs to be accomplished.</p>
<p>The job of a software engineer will become more about defining problems and architecting solutions, with coding agents handling the execution.</p>
<h2 id="heading-conclusion"><strong>Conclusion</strong></h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725265219392/b8dbd72c-57d8-4b28-a5b2-24e72e0d9e7d.png" alt class="image--center mx-auto" /></p>
<p>We’re on the brink of a transformation that will not only change how we code but also how we think about building software altogether.</p>
<p>And what can help navigate the ambiguities during this phase is Adaptability.</p>
<p>If you’re as excited about this future as I am, hit the subscribe button to stay adaptable to the new possibilities.</p>
<p>Leave a comment below—how do you think AI will change the way you work or build?</p>
<hr />
<iframe src="https://newsletter.adaptiveengineer.com/embed" width="480" height="320" style="border:1px solid #EEE;background:white;justify-content:center"></iframe>

<hr />
]]></content:encoded></item><item><title><![CDATA[How Small Talk Can Boost Your Career and Get You Promoted]]></title><description><![CDATA[Video
https://www.youtube.com/watch?v=gQxO8bX98Jc
 
Picture this: You're at work, head down, churning out projects like a machine. You're efficient, you're productive—you're doing exactly what's expected of you, right?
Well, not quite.
There's an inv...]]></description><link>https://zahere.com/how-small-talk-can-boost-your-career-and-get-you-promoted</link><guid isPermaLink="true">https://zahere.com/how-small-talk-can-boost-your-career-and-get-you-promoted</guid><category><![CDATA[Career]]></category><category><![CDATA[career advice]]></category><category><![CDATA[communication]]></category><category><![CDATA[careers]]></category><category><![CDATA[Career Growth]]></category><category><![CDATA[#Adaptability]]></category><dc:creator><![CDATA[Zahiruddin Tavargere]]></dc:creator><pubDate>Mon, 26 Aug 2024 08:04:51 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1724659233281/2c218a0d-3ab9-4027-808b-374c546a4df1.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-video">Video</h2>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=gQxO8bX98Jc">https://www.youtube.com/watch?v=gQxO8bX98Jc</a></div>
<p> </p>
<p>Picture this: You're at work, head down, churning out projects like a machine. You're efficient, you're productive—you're doing exactly what's expected of you, right?</p>
<p>Well, not quite.</p>
<p>There's an invisible currency in the workplace that many of us overlook. It's not measured in ROI or any metrics.</p>
<p>It's measured in something immeasurable - your <strong>connections.</strong></p>
<p>You know, those seemingly pointless conversations about the weather, your weekend, or what you had for lunch.</p>
<p>For many of us in the engineering world, small talk feels, well… like a waste of time.</p>
<p>But here’s the thing—small talk is actually a big deal. And not just in some fluffy, feel-good way. It can genuinely boost your career. Seriously.</p>
<p>For over a year, I've been diving deep into the concept of adaptability. And I’ve realized something fascinating—being adaptive isn’t just one skill. It’s built on four core pillars.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1724657879028/6546e146-69cc-44d1-b82c-3f14f20b59af.png" alt class="image--center mx-auto" /></p>
<p>These four pillars are the foundation of thriving in our rapidly changing world.</p>
<p>But today, we’re zooming in on one misunderstood pillar: Communication.</p>
<p>And when we talk about communication in the context of adaptability, we’re talking about something much more fundamental—and ironically, much more casual.</p>
<p>Small Talk.</p>
<h2 id="heading-historical-context-of-small-talk"><strong>Historical Context of Small Talk</strong></h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1724658046827/835eb640-dab3-44ba-b7e1-756fd49a0a1b.png" alt class="image--center mx-auto" /></p>
<p><em>Let’s take a quick trip back to 1923. An anthropologist named Bronisław Malinowski coined the term ‘phatic communion’.</em> Just a fancy way of saying small talk.</p>
<p>Malinowski described it as a type of speech where connections are made simply through the exchange of words—nothing deep, nothing groundbreaking, just everyday chatter.</p>
<p>But here’s the interesting part: he recognized that this kind of talk, these seemingly meaningless words, actually create bonds between people. It’s a fundamental part of how human beings connect.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1724658144676/1e09a67d-d7c3-422e-bd23-42769539553c.png" alt class="image--center mx-auto" /></p>
<p>In a follow-up study, Coupland et al. delved into the subtleties of phatic communion. They revealed that the simple phrase 'How are you?' is more than just a casual question—it's a social tool. It helps people navigate their roles in a conversation, express politeness, and even subtly influence the power dynamics at play.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1724658232994/53b332dc-480e-4ff4-901e-10fb23175d07.png" alt class="image--center mx-auto" /></p>
<p>I also came across an insightful HBR article titled 'The Surprising Power of Simply Asking Coworkers How They’re Doing.' According to a survey featured in the article, 39% of respondents said they feel the greatest sense of belonging when their colleagues check in with them, both personally and professionally.</p>
<p>Now that we have established the perceived benefits of small talk through research from respected sources, let me share my own perspective on the subject.</p>
<h2 id="heading-my-perspective"><strong>My Perspective</strong></h2>
<h3 id="heading-i-have-seen-great-leaders-do-small-talk">I have seen great leaders do small talk</h3>
<p>I’ve had the good fortune of working in three different industries, and in every single company, one thing I’ve consistently noticed is that the most successful leaders don’t dive straight into business.</p>
<p>They use small talk to start meetings, to ease the tension in the room, and to create a comfortable atmosphere.</p>
<p>It’s a simple gesture, but it's incredibly powerful and empathetic. They could just as easily dive into the agenda, but they choose to connect first.</p>
<h3 id="heading-my-relationship-with-my-managers">My relationship with my managers</h3>
<p>When it comes to my relationship with managers, the connection didn’t stop at work. I’ve bonded with great managers over WWE, comic-book movies, and football.</p>
<p>These conversations showed my personality and gave me a glimpse into theirs. This connection wasn’t just for fun—it helped me manage up more effectively.</p>
<p>And I believe it helped my managers understand how to coach me and deliver tough feedback in a way that I could truly absorb.</p>
<h3 id="heading-working-with-colleagues">Working with colleagues</h3>
<p>When working with colleagues across the globe, especially in different cultural contexts, small talk has been a game-changer for me.</p>
<p>Whenever I met a new colleague from outside India, I made sure to spend the first 10 minutes or so just chatting about anything other than work.</p>
<p>This laid a solid foundation for the meeting and, as a bonus, often led to lasting friendships.</p>
<p>Now that we’ve established why small talk matters, let me share three key benefits that can genuinely impact your career.</p>
<h2 id="heading-3-key-benefits-of-small-talk">3 Key Benefits of Small Talk</h2>
<h3 id="heading-breaking-the-ice">Breaking the Ice</h3>
<p>Small talk is a great way to ease into more serious discussions. When you engage in light conversation, you create a comfortable environment for deeper conversations later on.</p>
<h3 id="heading-building-connections">Building Connections</h3>
<p>Engaging in small talk allows you to connect with colleagues on a personal level. This connection can lead to trust, making it easier for you to collaborate and work effectively together.</p>
<h3 id="heading-visibility">Visibility</h3>
<p>Regularly engaging in small talk increases your visibility within the organization. When people see you as approachable and friendly, they are more likely to think of you when opportunities arise.</p>
<p>So, how does all this translate into promotions?</p>
<h2 id="heading-the-impact-on-promotions">The Impact on Promotions</h2>
<h3 id="heading-networking">Networking</h3>
<p>The more connections you have, the more advocates you create within your organization. These advocates can vouch for your skills and work ethic when promotion opportunities arise.</p>
<h3 id="heading-being-top-of-mind">Being Top of Mind</h3>
<p>When you engage in small talk regularly, you keep yourself on the radar of decision-makers. They are more likely to think of you when considering candidates for promotions.</p>
<h3 id="heading-creating-allies">Creating Allies</h3>
<p>Building relationships through small talk can lead to mentorship opportunities. Mentors can provide guidance, support, and even recommend you for promotions.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>I’m not suggesting that small talk is the only key to promotions or the most important soft skill.</p>
<p>It’s important to recognize that this advice is based on the assumption that you’re already excellent at your job and fully deserving of that role.</p>
<p>But when the decision comes down to equally talented individuals, small talk can be the factor that sets you apart.</p>
<p>It’s the person who’s approachable, who connects well with others, and who’s seen as easy to work with that often comes to mind first.</p>
<hr />
<p>One of my popular articles:</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://zahere.com/how-to-build-an-ai-agent-without-using-any-libraries-a-step-by-step-guide">https://zahere.com/how-to-build-an-ai-agent-without-using-any-libraries-a-step-by-step-guide</a></div>
]]></content:encoded></item></channel></rss>