
Selecting the Right Speech Analytics Application 

By Donna Fluss

SPEECH ANALYTICS, also known as audio mining, is an application used to structure conversations and find embedded information, including customer insights, implicit needs and wants, and the root causes of issues. These applications can also be used to determine whether staff comply with scripts and regulations. Some speech analytics applications can identify concepts and trends that organizations didn’t even know to watch for. When used in the call center, speech analytics enables managers and executives throughout the enterprise to address the issues that generate call volume and to identify competitive challenges and new revenue opportunities.

Speech Analytics Technology Overview

There are six functional components of a speech analytics application:

  1. Speech engine: this is the layer that does the initial analysis of the audio stream and converts the data into a file containing a series of phonemes or a first-pass text transcript.
  2. Indexing and analysis layer (augmentation): this software improves the accuracy of the speech engine’s output. Its role is to make sense of the findings from the speech engine and to index them for further analysis, queries and ad hoc searching. This is where the tools import data from other telephony and servicing solutions. Most current R&D investment is focused on this area of speech analytics.
  3. Query engine: this is a user interface (UI) where end users define their queries and the output that they expect from the speech analytics tool.
  4. Search tool (criteria): this tool is used to conduct ad hoc searches on processed audio files or indices; it should be easy to use and allow nested filtering.
  5. Reports and dashboards: these are formats for clearly presenting system findings in a flexible, customizable and graphically appealing manner. They allow end users to drill down and filter their results.
  6. Business applications: these prepackaged applications can help users quickly realize benefits from speech analytics. Modules are available for root-cause analysis, customer retention, first call resolution (FCR), competitive intelligence, sales and marketing effectiveness, script adherence, collections effectiveness, and so on. See Figure 1.

Figure 1 – Speech Analytics Technology Building Blocks

Source: DMG Consulting LLC, August 2008
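
The indexing layer and search tool described above can be pictured with a small sketch. This is only an illustration under stated assumptions, not how any particular product works: the `Call` record, its metadata fields, and the `build_index` and `search` helpers are all hypothetical, and plain text transcripts stand in for the speech engine's output. It shows an index built once over processed calls, then an ad hoc search narrowed by nested filters.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    call_id: str
    department: str    # hypothetical metadata imported from telephony systems
    duration_sec: int
    transcript: str    # stand-in for the speech engine's output


def build_index(calls):
    """Map each lowercase word to the set of call IDs that contain it."""
    index = defaultdict(set)
    for call in calls:
        for word in call.transcript.lower().split():
            index[word.strip(".,?!")].add(call.call_id)
    return index


def search(calls, index, term, filters=()):
    """Return IDs of calls containing `term`, narrowed by each filter in turn."""
    hits = index.get(term.lower(), set())
    results = [c for c in calls if c.call_id in hits]
    for keep in filters:  # nested filtering: each filter narrows the previous result
        results = [c for c in results if keep(c)]
    return [c.call_id for c in results]


# Made-up sample data for illustration.
calls = [
    Call("c1", "billing", 420, "I want to cancel my account."),
    Call("c2", "billing", 95, "Please cancel the late fee."),
    Call("c3", "sales", 300, "Can I upgrade my plan?"),
]
index = build_index(calls)

long_billing_cancels = search(
    calls,
    index,
    "cancel",
    filters=[lambda c: c.department == "billing", lambda c: c.duration_sec > 120],
)
print(long_billing_cancels)
```

Because the index is built ahead of time, each new ad hoc query only looks up the term and applies the filters; the audio does not need to be reprocessed.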

LVCSR vs. Phonetics

All speech analytics applications use an underlying speech engine to perform their initial analysis. The two primary types of speech engines are large vocabulary continuous speech recognition (LVCSR) engines and phonetic engines. LVCSR engines depend on a language model that includes a vocabulary/dictionary for speech-to-text conversion of audio files. The text file is then searched for target words, phrases and concepts. Phonetic-based applications separate conversations into phonemes, the smallest components of spoken language; they then find segments within the long file of phonemes that match a phonetic index file representation of target words, phrases and concepts. Interestingly, part of LVCSR processing involves breaking words down into phonemes, which is one reason why many of the LVCSR vendors now claim to also do a phonetic analysis.
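
The difference between the two approaches can be sketched in a few lines. This is a toy illustration, not a real engine: the pronunciation dictionary, the phoneme symbols (ARPAbet-style), and both search functions are assumptions made for the example, and real phonetic engines match approximately and score candidates rather than requiring the exact match used here.

```python
# Hypothetical pronunciation dictionary (ARPAbet-style symbols); illustration only.
PRONUNCIATIONS = {
    "cancel": ["K", "AE", "N", "S", "AH", "L"],
    "refund": ["R", "IY", "F", "AH", "N", "D"],
}


def find_in_transcript(transcript, term):
    """LVCSR-style search: look for the term among the transcribed words."""
    words = [w.strip(".,?!").lower() for w in transcript.split()]
    return [i for i, w in enumerate(words) if w == term.lower()]


def find_in_phonemes(phoneme_stream, term):
    """Phonetic-style search: slide the term's phonemes along the stream."""
    target = PRONUNCIATIONS[term.lower()]
    n = len(target)
    return [
        i
        for i in range(len(phoneme_stream) - n + 1)
        if phoneme_stream[i:i + n] == target  # exact match is a simplification
    ]


# Toy engine outputs for the same utterance, "I want to cancel".
transcript = "I want to cancel"
phoneme_stream = ["AY", "W", "AA", "N", "T", "T", "UW",
                  "K", "AE", "N", "S", "AH", "L"]

print(find_in_transcript(transcript, "cancel"))    # word positions of the term
print(find_in_phonemes(phoneme_stream, "cancel"))  # phoneme offsets of the term
```

In this sketch the transcript search can only find words the recognizer actually wrote into the text, while the phoneme search needs only a pronunciation for the target term at query time.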

As seen in Figure 2, phonetic engines are generally easier to deploy because they do not depend upon developing a language model or predefining all of the words, phrases and terms that are used by an organization. However, while phonetic-based tools can process large volumes of data more quickly and without a language model, LVCSR-based tools have proven to be more accurate in discerning the details of the reasons why customers call.

Figure 2 – LVCSR vs. Phonetic Engines

Source: DMG Consulting LLC, August 2008

Selecting the Right Speech Analytics Vendor

Enterprises should apply their standard technology selection best practices to a speech analytics acquisition. The best practices below assume that an organization has already approved an investment in speech analytics. But teams that first need to justify the investment should build a business case that includes a return on investment (ROI) analysis and submit it to the investment decision committee (or chief financial officer) for approval.

Since speech analytics is a cross-functional application that provides benefits to sales, marketing, service, operations, R&D, and other departments, it’s important to make vendor selection a group decision. Although this slows the selection process, it helps ensure a successful implementation and speeds adoption throughout the enterprise. Here are the recommended steps for purchasing a speech analytics application.

  1. Create a cross-functional analytics project team with representatives from all affected departments. Schedule regular project reviews and status meetings.
  2. Identify and document the specific business problems or opportunities to be addressed—operational challenges, competitive situations, sales opportunities, retention, quality issues, etc. (The chosen analytics application should be able to address all of the areas, but by tackling one issue at a time, the organization is more likely to be successful.)
  3. Obtain a corporate sponsor for the project. The ideal candidate is a senior executive who is respected by the various constituents participating in the project team.
  4. Determine exactly which business issue/opportunity to improve/enhance first. This will require the project team to prioritize the needs of the various participants.
  5. Conduct a baseline analysis or assessment of the issue/opportunity in order to determine later whether the speech analytics application has achieved its goals.
  6. Compile a list of functional requirements based on interviews with all of the departments or executives that are going to use or support the application or its output. Turn this data into a formal request for information (RFI) document.
  7. Identify eight to 10 vendors that claim to have the functional capabilities, necessary resources and proven experience to accomplish the company’s goals. Issue the RFI to the vendors. Be sure to ask vendors for references from companies that have similar operating environments and needs, and be sure to ask for pricing information.
  8. Based on the RFI responses, calls to references and pricing, select five vendors to include in the formal selection process. This decision should be made by the team. (The speech analytics market is complex, and there are many types of vendors. See the sidebar.)
  9. At this point, it’s a good idea to decide on the preferred deployment model. All of the speech analytics vendors license their products, and a growing percentage also offer hosted and managed-care offerings. Make sure that most of the five selected vendors offer the preferred deployment model. (Some companies put off deciding on the deployment approach. While this will increase vendor options, it may ultimately result in having to redo the vendor selection process if the initially selected vendors do not offer the preferred acquisition alternative.)
  10. Create and issue a request for proposal (RFP) document to the five vendors selected to participate in the formal selection process. (To assist in this process, you might consider purchasing a report like DMG Consulting’s Speech Analytics Market Report. This report lays out the functional, technical, pricing and reference information required to make the right selection. It is available at
  11. Based on an assessment of the RFP responses, invite three vendors to come on-site and deliver a detailed presentation about their products. Conduct a phone call meeting with the vendors to prepare them for the on-site presentation. Tell them in advance about the company’s priorities and goals so that they know what they need to present.
  12. Build an ROI and total cost of ownership analysis to understand the financial impact and benefits of the three options.
  13. After meeting the vendors, seeing the demos, speaking to references, analyzing the RFP responses, and conducting the financial analysis, select two top contenders and prioritize a favorite. (It’s always good to select two contenders in order to establish a stronger negotiating position.)
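
The financial comparison in step 12 can be sketched as follows. This is a minimal illustration under simple assumptions: costs and benefits are flat per year, nothing is discounted, and ROI is defined as net benefit divided by total cost of ownership. The `VendorOption` fields, the vendor names, and every figure are made-up placeholders.

```python
from dataclasses import dataclass


@dataclass
class VendorOption:
    name: str
    upfront_cost: float    # e.g. licenses, implementation, training
    annual_cost: float     # e.g. maintenance, hosting, support staff
    annual_benefit: float  # estimated savings plus added revenue


def total_cost_of_ownership(option, years):
    """Upfront cost plus recurring costs over the evaluation period."""
    return option.upfront_cost + option.annual_cost * years


def roi(option, years):
    """Net benefit over the period divided by total cost of ownership."""
    tco = total_cost_of_ownership(option, years)
    return (option.annual_benefit * years - tco) / tco


# All figures are made-up placeholders.
options = [
    VendorOption("Vendor A", 300_000, 60_000, 250_000),
    VendorOption("Vendor B", 150_000, 120_000, 240_000),
    VendorOption("Vendor C", 400_000, 40_000, 220_000),
]

YEARS = 3
ranked = sorted(options, key=lambda o: roi(o, YEARS), reverse=True)
for o in ranked:
    tco = total_cost_of_ownership(o, YEARS)
    print(f"{o.name}: TCO={tco:,.0f} ROI={roi(o, YEARS):.0%}")
```

A ranking like this is only one input to step 13; the demos, references and RFP responses still carry weight alongside the numbers.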

This is where most technology selection processes end. The project team would then negotiate contract terms and conditions with the vendor of choice. If this vendor is unreasonable or not willing or able to meet the required terms and conditions of the purchasing company, the selection team should move on to the second-choice vendor. Since speech analytics is still relatively new, DMG suggests that users who have the budget and time conduct a pilot before committing to a large investment in speech analytics. The project team should select one application and treat the pilot as if it were a full implementation because, if successful, the application can be rolled out to other areas.

If the pilot is completed on time and on budget, the results are as expected (or better), and the vendor has proven to be a good partner, the application should be phased in to other operating areas. If the pilot is not successful, the project team should re-evaluate its selection based on the actual results and either bring in different prospective vendors for a closer look or select new vendors based on the information gathered during the pilot. At this point, a second pilot should not be necessary, as the company now has firsthand experience on which to base its decision. Conducting a pilot is a great way to increase the chances for success, even though most vendors would rather avoid this step.

Final Thoughts

Speech analytics is one of the fastest growing applications in the call center market because it contributes to the enterprise’s bottom line and gives managers insights that can reduce operating expenses, increase revenue, decrease customer attrition and improve the customer experience. The typical payback period from a successful speech analytics implementation is six to 12 months. The challenge is that a speech analytics implementation is not easy and requires ongoing support and fine-tuning to realize its full benefits.
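
A payback period like the six-to-12-month range above can be estimated with simple arithmetic. The sketch below assumes a one-time upfront cost and a constant monthly benefit and running cost; the function name and all figures are hypothetical placeholders, not data from any implementation.

```python
def payback_months(upfront_cost, monthly_benefit, monthly_cost):
    """Months until cumulative net benefit covers the upfront cost."""
    net = monthly_benefit - monthly_cost
    if net <= 0:
        return None  # never pays back under these assumptions
    return upfront_cost / net


# Made-up figures for illustration.
months = payback_months(
    upfront_cost=300_000, monthly_benefit=40_000, monthly_cost=10_000
)
print(months)
print(6 <= months <= 12)  # whether the estimate falls in the typical range
```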

Sidebar – Overview of Speech Analytics Vendors

There are 23 vendors in the speech analytics market, and more are expected to join the market. The vendors fall into the following categories:

  • STANDALONE SPEECH ANALYTICS PROVIDERS: vendors that sell only speech analytics suites.
  • EMOTION DETECTION PROVIDERS: vendors that sell emotion detection software, which is considered part of the broader speech analytics market.
  • WORKFORCE OPTIMIZATION (WFO) SUITE PROVIDERS: vendors that sell a suite of management applications intended to improve the performance of contact centers. These suites include a speech analytics application.
  • STANDALONE ENGINE PROVIDERS: vendors that sell only the underlying speech analytics phonetic or LVCSR engine.
  • OTHERS: vendors that fall into a different category, such as a call center infrastructure vendor or analytics vendor that provides speech analytics among its offerings.
