TL;DR
I led the program and development of a scalable AI-driven CX insights engine that transformed 200,000 annual customer call transcripts into structured, topic-based insights using Gong, Python, Tableau, and AI/NLP tools such as BERTopic. The program identified emerging customer signals, flagged product and service issues, and delivered executive-ready narratives that drove decision making across Product, Sales, CS, and Support.
Last year, I was leading a CX insights program that needed to do something that most companies struggle with: extract real meaning from a massive backlog of customer calls.
Our company boasts over a billion members, with many thousands of corporate clients leveraging our hiring, advertising, and business tools. We had more than 200,000 unique calls recorded in Gong each year: sales calls, customer success calls, support escalations, onboardings, renewals, and more, across multiple segments and product lines.
The transcripts were there, but when teams tried to pull insights from this qualitative backlog it was slow, siloed, and hard to compare apples-to-apples. It seemed that everyone had their own spreadsheet, summary, or interpretation of this call data.
Our internal stakeholders wanted to know:
- “What are our customers struggling with?”
- “Why are some of them churning?”
- “What patterns are we missing?”
The feedback they needed was rich, but buried. We needed a way to hear what customers were saying at scale and in time to act on these signals.
The Problem: More Input Than Insight
On the surface, we had everything we needed: transcripts, metadata, filtering options. But reviewing them wasn’t feasible, especially across multiple lines of business. Even a modest sample of 50 calls could take hours to scan and tag, and manual analysis of that kind tends to feel anecdotal and subjective.
What we didn’t have was a scalable way to:
- Detect patterns across thousands of calls
- Track how those patterns change over time
- Quantify customer sentiment by theme or product line
- Deliver insights fast enough to be truly actionable for the business
This wasn’t a listening problem. It was a signal extraction problem.
The Solution: Turn Calls into Chunks, and Chunks into Topics
To make sense of conversation data at this scale, we needed a system that could turn raw transcripts into structured, interpretable signals. The solution was to build a Python-based AI pipeline that combined preprocessing, topic modeling, and sentiment scoring to surface the patterns buried in thousands of hours of Gong calls.
Here’s how it worked:
- Preprocessing: Clean the Noise and Structure the Input
The first step was to prepare the transcript data for analysis. Each call transcript was filtered to include only the customer side of the conversation. This helped focus the analysis on customer intent, friction, and perception of issues.
Additionally, non-essential portions of the conversation, such as greetings, small talk, and filler words, were removed to reduce noise and elevate the meaningful signals.
Finally, the remaining transcript was segmented into 200-word chunks, allowing for more granular analysis and improved model performance. This preprocessing phase created a clean, high-signal dataset optimized for BERTopic and sentiment modeling.
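A minimal sketch of this step in Python, assuming transcripts arrive as speaker-labeled utterances; the field names and the filler list here are illustrative rather than the exact ones from our pipeline:

```python
import re

# Tokens treated as conversational filler (illustrative list, not exhaustive).
FILLER = {"um", "uh", "hmm", "mhm", "yeah"}
GREETING_PATTERN = re.compile(r"^(hi|hello|hey|good (morning|afternoon))\b", re.IGNORECASE)

def preprocess_transcript(utterances, chunk_size=200):
    """Filter a Gong-style transcript down to clean, 200-word customer chunks.

    `utterances` is assumed to be a list of dicts like
    {"speaker": "customer", "text": "..."}; the exact shape will depend
    on how you export transcripts from Gong.
    """
    words = []
    for utt in utterances:
        # Keep only the customer side of the conversation.
        if utt["speaker"] != "customer":
            continue
        text = utt["text"].strip()
        # Drop greeting/small-talk utterances.
        if GREETING_PATTERN.match(text):
            continue
        # Drop filler tokens to reduce noise.
        words.extend(w for w in text.split()
                     if w.lower().strip(".,!?") not in FILLER)

    # Segment the remaining customer speech into fixed-size chunks.
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]
```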
- Topic Modeling: Grouping Verbatims into Meaningful Topics
With structured transcript chunks in place, the next step was to apply BERTopic for topic modeling. This model groups semantically similar transcript chunks into topics, each representing a recurring theme or issue that occurs across many calls.
For each line of business, the model typically surfaced around 100 distinct topics. Each topic was defined by a set of frequently co-occurring words and related terms (such as “data sync,” “mapping,” and “CRM”) that captured the core idea of the cluster. These topic labels made it easy to quickly interpret what customers were talking about across a broad sample of calls.
This approach provided a scalable and repeatable way to track real customer feedback at volume, without relying on predefined categories or manual tagging.
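Here is a sketch of the modeling step using the BERTopic library. The hyperparameters are placeholders; only the rough topic count per line of business is stated above, and in practice these values were tuned per dataset:

```python
from bertopic import BERTopic

# `chunks` is the list of 200-word customer verbatims from preprocessing.
# nr_topics caps the model near the topic count we saw per line of business.
topic_model = BERTopic(nr_topics=100, min_topic_size=25, verbose=True)
topics, probs = topic_model.fit_transform(chunks)

# Each topic comes with defining keywords, e.g. "data sync", "mapping", "CRM".
info = topic_model.get_topic_info()   # one row per topic: id, size, name
keywords = topic_model.get_topic(0)   # [(word, weight), ...] for topic 0
```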
- Topic Sentiment: Assigning Meaning to Customer Emotion
Once each verbatim had a topic assigned, the next layer of analysis focused on sentiment. Each customer transcript chunk was scored using a combination of BERT and TextBlob, which converted the natural language into a sentiment value between -1 and +1, indicating negative, neutral, or positive expression.
By aggregating these scores across all the chunks within a topic, a net-sentiment score could be calculated. This provided a more complete picture of not just what customers were talking about, but how they felt about it. (A sketch of this scoring follows the examples below.)
For example:
- A topic related to “support ticket response times” might show rising volume with increasingly negative sentiment
- A topic related to “new AI features” might emerge with high volume but positive sentiment
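One way to implement the blended scoring, assuming a Hugging Face checkpoint stands in for the BERT component; the exact model and weighting in our pipeline may differ:

```python
from textblob import TextBlob
from transformers import pipeline

# A BERT-family sentiment classifier; this specific checkpoint is an assumption.
bert_sentiment = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

def score_chunk(text):
    """Blend TextBlob polarity with a BERT classifier into a score in [-1, +1]."""
    # TextBlob polarity is already in [-1, +1].
    tb = TextBlob(text).sentiment.polarity
    # Map the classifier output (label + confidence) onto the same scale.
    # Crude character-level truncation keeps us under the model's token limit.
    result = bert_sentiment(text[:512])[0]
    bert = result["score"] if result["label"] == "POSITIVE" else -result["score"]
    # Simple average of the two signals; the real weighting may differ.
    return (tb + bert) / 2

def net_sentiment(chunks):
    """Aggregate chunk-level scores into a topic-level net-sentiment score."""
    scores = [score_chunk(c) for c in chunks]
    return sum(scores) / len(scores) if scores else 0.0
```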
The combination of these three steps (preprocessing, topic modeling, and topic sentiment) created a structured, data-backed way to surface emerging CX drivers, quantify their impact, and track how those conversations evolve over time. We had now transformed those 200,000+ calls into reportable signals.
The Outcome: From Raw Transcripts to Strategic Customer Signals
Once preprocessing, topic modeling, and sentiment scoring were complete, the next step was to enrich each theme so that these topics would be easily accessible to our internal stakeholders. Essentially, we needed to build the “So, what?” for each topic.
Every month, for every topic identified, we generated the following outputs from AI prompts issued via Python (sketched after this list):
- Category Assignment: each topic was grouped into a broader customer-facing category such as product feedback, contracts & pricing, service experience, competition, etc. This made it easier to align themes with internal team structures and business functions.
- Theme drivers: GPT was used to analyze the most recent call verbatims within each topic and extract the top 4-8 subtopics. These drivers were the recurring phrases or themes that emerged within the topic.
- AI-generated summaries: a brief 2-3 sentence executive summary was created for each topic, capturing the main customer concerns and talking points.
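A sketch of one such enrichment call using the OpenAI Python client; the model name and prompt wording are illustrative, not the exact ones we ran:

```python
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = """You are a CX analyst. Given customer call verbatims for one topic,
return JSON with: "category" (e.g. product feedback, contracts & pricing,
service experience, competition), "drivers" (4-8 recurring subtopics), and
"summary" (a 2-3 sentence executive summary of the main customer concerns)."""

def enrich_topic(verbatims):
    """Generate the category, theme drivers, and summary for one topic."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative; use whatever model fits
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": PROMPT},
            # Cap the verbatims passed in to keep the prompt a manageable size.
            {"role": "user", "content": "\n\n".join(verbatims[:50])},
        ],
    )
    return json.loads(response.choices[0].message.content)
```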
With these outputs in place, we used Python to track and compare key metrics for each topic and its associated drivers over time. This included:
- Month-over-Month and Quarter-over-Quarter changes in volume and sentiment
- Standard deviation and p-value calculations to flag topics with statistically significant movement
- Early detection of emerging trends or signals of success from product/service interventions
This made it possible to automatically detect across 200,000+ calls (see the sketch after this list) when:
- A new issue was surfacing
- An existing topic was worsening or improving
- A product or service fix was shifting customer volume or sentiment
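As a sketch, the statistical check for one topic might compare this month’s chunk-level sentiment scores against last month’s with a two-sample t-test; the threshold and output fields here are illustrative:

```python
from scipy import stats

def flag_topic_movement(current_scores, prior_scores, alpha=0.05):
    """Flag a topic whose sentiment moved significantly month-over-month.

    `current_scores` / `prior_scores` are the chunk-level sentiment values
    for the topic in each month.
    """
    # Welch's two-sample t-test on the sentiment distributions.
    t_stat, p_value = stats.ttest_ind(current_scores, prior_scores,
                                      equal_var=False)
    delta = (sum(current_scores) / len(current_scores)
             - sum(prior_scores) / len(prior_scores))
    return {
        "mom_sentiment_delta": round(delta, 3),
        "p_value": round(p_value, 4),
        "significant": p_value < alpha,
        "direction": "improving" if delta > 0 else "worsening",
    }
```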
Delivering the Insights: Dashboard + Monthly Signals Report
The final outputs of the process were delivered in two recurring formats that served different but complementary goals for the company:
Self-Serve Tableau Dashboard
A Tableau dashboard was updated monthly with fresh Gong topic data; the extract feeding it is sketched after this list. The dashboard displayed:
- Topic-level volume and sentiment trends
- Category-level filters (e.g. product feedback, service issues)
- AI-generated summaries and driver lists
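The monthly extract behind those views is a straightforward aggregation. A sketch with pandas, using illustrative column names:

```python
import pandas as pd

def build_dashboard_extract(df: pd.DataFrame) -> pd.DataFrame:
    """Roll chunk-level rows up to monthly topic volume and net sentiment.

    `df` is assumed to have one row per scored chunk with columns
    call_date, category, topic, and sentiment.
    """
    df["month"] = pd.to_datetime(df["call_date"]).dt.to_period("M")
    extract = (df.groupby(["month", "category", "topic"])
                 .agg(volume=("topic", "size"),
                      net_sentiment=("sentiment", "mean"))
                 .reset_index())
    return extract  # written out monthly as the Tableau data source
```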
The dashboard was broadly used across the business from CX to Product to GTM to Support. It gave internal stakeholders the ability to explore topics relevant to their domain, view supporting quotes, and monitor changes over time. It also provided a shared, standardized view of what customers were saying that was grounded in data and not anecdotes.
Monthly CX Report: Customer Signals from Gong
In parallel, a curated Customer Signals report was published each month to highlight the 2-3 most meaningful or emerging topics per line of business. This report was built from the underlying data but elevated the signal with narrative structure and clear prioritization:
- The topic summary from that month
- A sample of customer quotes illustrating each topic driver
- Volume and sentiment data to demonstrate scale
- A short interpretation, tagged stakeholders, and suggested next steps
This report was delivered directly to product and service managers and functional leads, and it had visibility among our executive leadership. It served a critical function: ensuring that our stakeholder teams were aware of what was changing in the customer experience and had real-time evidence to inform their roadmap, messaging, or operational strategy.
Final Thoughts
This process transformed Gong from a passive call recording system into an active CX intelligence layer at our company. We moved from pulling scattered quotes and manually tagging calls to delivering trend-based, sentiment-weighted insights that Product, CS, Support, and CX teams could trust and act on.
- The models and dashboards made insights scalable.
- The narratives and summaries made them accessible.
- The signal detection layer ensured we acted on emerging insights before they escalated.
If you are sitting on a mountain of conversation data and wondering what your customers are really telling you, the answers are likely already there. You just need the right process to listen and surface what truly matters most.
While this methodology was built around Gong transcripts, the same approach applies to other qualitative sources: NPS surveys, support ticket interactions, chat transcripts, etc. It is a framework for making sense of unstructured feedback wherever it lives.
And with a slight shift in focus, the same pipeline can be applied to the employee side of the conversation rather than the customer side. This offers a powerful way to monitor how consistently your team is delivering key messages, reinforcing value propositions, and aligning to your go-to-market strategy.
Listening is just the starting point. The real impact comes from building systems that transform raw conversations into meaningful, actionable signals—at scale.
Timothy Brown, CCXP
