Jump to content

ChatGPT vs. Bard: A realistic comparison


aum

Recommended Posts

Let’s see how Bard does vs. ChatGPT, without preconceptions or hype. One person’s totally unscientific, anecdotal, but realistic field experiment.

 

Apricot is an “AI-powered” product in a few ways. Most directly, the Apricot app uses a large language model (LLM) service for on-demand content summaries. Behind the scenes, I use LLM chat bots to write drafts of both new code and marketing content, like LinkedIn posts and Tweets.

 

So far, I’ve used ChatGPT for this (GPT-3.5 for the API, GPT-4 in the chat window), but Google made Bard generally available this week and upgraded it to use its new PaLM 2 LLM and the hype men came out in predictable force.

hype_tweet.png

A Bard hype tweet. One example of many.

Many of the examples in these hype tweets don’t stand up to scrutiny, in terms of being meaningful for everyday use.

 

  • Sure, Bard can export code directly to Google Colab, but Ctrl-C, Ctrl-V into a notebook isn’t that hard either.

  • Yes, Bard can get yesterday’s stock returns, but so can Yahoo Finance.

  • Sometimes the examples just don’t replicate. I asked Bard for specific SEO tips for Apricot’s landing page and it gave generic suggestions. Fine, but nothing special.

  • Voice prompts are probably very helpful to some people, but not a deciding factor for most.

 

I don’t want to fall behind the pack, though, so let’s put Bard to the test vs. ChatGPT for my actual Apricot-related use cases.

Some context

This is not a scientific study, clearly. Once upon a time, I enjoyed doing controlled, in-depth, technical comparisons of ML models, but those days are past. In this post, I’m going to take about an hour to explore a few use cases, make a decision, and move on to the rest of my long to-do list.

The competitors

  • OpenAI ChatGPT, with the GPT-4 model, in the chat window at https://chat.openai.com/. I pay about $20/month for this.

  • Google Bard, in the chat window at https://bard.google.com/. I’m not sure what the model is under the hood, so minus 1 point for Google. But it’s free, so plus 1 point and we’re back to even.

Challenge and Results

1. Summarize long-form content

Benn Stancil writes a great data science newsletter but his headlines are super vague. Take Stancil’s post The new philosophers as an example—I have no idea what’s its about from the title alone. Let’s ask our AI assistants to summarize the content, so we know if it’s worth our time to read.

 

My prompt to each system is the same as the Apricot App’s prompt for on-demand user-facing summaries:1

 

Summarize the following content in one sentence: “““{text}”””

ChatGPT says:

chatgpt_summary.png

 

Bard says:

Well, hang on a sec…Bard truncates the input, and it’s not obvious where. Let’s hope Stancil didn’t save the punchline for the very end…Kinda surprising, to be honest, when other tools are expanding the context window dramatically.

At any rate, Bard’s answer is:

 

bard_summary.png

 

Side note: the View other drafts option is intriguing but I’m not sure it makes sense, from a product perspective. Wouldn’t it be better to iterate with a new prompt if I don’t like the response? I don’t have the patience to read three answers every time I use the tool.

 

Winner: ChatGPT

 

Bard’s summary is too terse; it conveys very little information. ChatGPT, on the other hand, captures explicitly that the next evolution of the data management tools will be about AI specifically and could be a fundamental, discontinuous change.

2. Write a Python function

Users coming to Apricot from an RSS reader can import their feeds in bulk with an OPML file. When I set out to build this feature, I didn’t know much about OPML so I asked ChatGPT to write the core parsing code.

Write a Python function to parse an OPML file.

ChatGPT’s answer:

import xml.etree.ElementTree as ET

def parse_opml(file_path):
    # Parse the OPML file
    tree = ET.parse(file_path)
    root = tree.getroot()
    
    # OPML should have a 'body' child, which contains 'outline' elements
    body = root.find('body')

    outlines = []

    for outline in body.findall('outline'):
        # Each outline should have 'text' attribute
        text = outline.get('text', '')
        
        # Append it to our list
        outlines.append(text)

    return outlines

Bard’s code:

def parse_opml(filename):
  """Parses an OPML file.

  Args:
    filename: The path to the OPML file.

  Returns:
    A list of dictionaries, where each dictionary represents an outline item.
  """

  with open(filename, 'r') as f:
    xml = f.read()

  # Parse the XML into a tree.
  tree = ET.fromstring(xml)

  # Get the outline items.

  outline_items = []
  for outline_item in tree.findall('outline'):
    outline_item_dict = {}
    outline_item_dict['title'] = outline_item.find('title').text
    outline_item_dict['url'] = outline_item.find('xmlUrl').text
    outline_items.append(outline_item_dict)

  return outline_items

OK, this is not the biggest thing, but why on earth does Bard use indentation with 2 spaces instead of the standard 4? This drove me up the wall, so I tried all sorts of prompts to get Bard to use indentation with 4 spaces but they all failed. There must be some sort of post-processing thing going on, but that’s strange because Google’s official style guide says to use 4 spaces.

 

Moving on.

 

Frankly, both of these code snippets are pretty bad. ChatGPT’s code only extracts the text field of each entry in the OPML file, but there’s typically other important stuff like URLs.

 

Bard’s code looks better at first glance because it does try to extract multiple fields (title and xmlUrl) from each entry. But it fails to run because it uses the ElementTree class without importing it first. Once that’s fixed, it runs but returns an empty list because it fails to understand that the findall method only searches direct children of the current node, but the individual outline elements are nested two levels deep.

 

It’s no surprise that the code isn’t perfect on the first pass; what matters is how quick and easy it is to iterate to a good version. For ChatGPT, I extended the conversation to say

Please extract all fields from each outline element of the input file into a Python dict.

and it updated the code to an elegant, correct answer.

 

For Bard, I replied

Please update the code to parse outline elements in the body of the input document.

and it did fix the known problem. But the new code still doesn’t work because it fails to recognize that for OPML outline elements, the data is stored as tag attributes, not encapsulated text. Sigh. 😞

 

Winner: ChatGPT

3. Social media marketing tactics

As a bootstrapped, solo project, Apricot has relied so far on social media and content marketing. It’s about time to make a Show Hacker News post, which—done well—can generate a big spike in traffic and sign-ups. Let’s ask our AI assistants to how to make the post successful.

Bard first this time:

 

bard_showhn.png

 

And now ChatGPT, with the same prompt:

 

chatgpt_showhn.png

 

Unlike the code challenge, this time both answers feel pretty good! 🥳

 

Both LLMs include tips about both content and logistics, as requested, both seem to understand the Hacker News culture, and both give good, “correct” advice. My only complaint is that both are very generic. Compare to this post on Indie Hackers, which gives a very specific recipe for posting successfully on Hacker News2.

 

Winner: tie

4. Write a customized Tweet

Continuing in the same vein, I often use ChatGPT to write first drafts of Apricot marketing content. The difference with this challenge is that the content needs to be about Apricot specifically or even a particular blog post topic. This time I’ll use a bit of prompt engineering to get the tone I want.

 

Let’s do Bard first again:

bard_tweet_v1.png

 

Yikes! I didn’t think this would be a particularly sensitive request. Let’s click that Regenerate drafts button a few times…

 

bard_tweet_v2.png

 

And now ChatGPT:

chatgpt_tweet.png

 

Looks great, but it doesn’t fit within the character limit! I had to remove the hashtags to make it fit.

 

Winner: tie

 

I like ChatGPT’s emojis and hashtags but Bard’s text-only version is a fair response to my guidance of a “direct, no-spam” brand. Bard refused to answer at all at first, but ChatGPT’s response was too long. Both tweets are live, so you can vote yourself. 😀

 

Bard tweet

ChatGPT tweet

And the overall winner is….

ChatGPT.

 

ChatGPT beat Bard pretty convincingly in summarization and Python coding. That’s worth $20/month to me, at least for now, despite the fact that they tied in Tweet and marketing tactics generation.

 

I’m happy there’s competition and I suspect Bard will improve quickly—I look forward to re-doing this test again soon!

Footnotes

  1. The text of the article goes in the {text} slot, of course.↩︎

  2. In short, use a tool (like Apricot!) to find news articles in your area of expertise as soon as they’re published and post them quickly.

 

Source

Link to comment
Share on other sites


  • Replies 1
  • Views 803
  • Created
  • Last Reply

Top Posters In This Topic

  • Akaneharuka

    1

  • aum

    1

Popular Days

Top Posters In This Topic

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...