Conference Program

Now in our fifth year, Text Analytics Forum is a place for sharing ideas and experiences in text analytics for developers from beginner to advanced. We cover all aspects and approaches to text analytics, including machine learning and AI, semantic categorization rules, tools ranging from build-your-own to advanced development-testing platforms, and human-machine hybrid applications.

The last few years have seen some fantastic advances in AI in both theory and practice. However, the vast majority of those advances have been in data and pattern-based AI, not in text. Even the much-hyped GPT-3 from OpenAI largely treats text as sets of complex patterns without any real understanding of the meaning of the words in its truly gigantic “training set.” Three-year-old children continue to outperform the best AI programs in learning language at the meaning level – learning new words from a single exposure rather than billions of examples.

Text analytics to the rescue! AI is only as smart as the content on which it is trained and text analytics is the best tool to create smarter training sets. It is also the best tool to create smart and useful applications of all kinds—search, customer and business intelligence, sentiment and social media analysis, and new applications no one has thought of—yet.

Text Analytics Forum 2021 will showcase how enterprises are using text analytics and AI (or other techniques) to create really useful applications, enhance taxonomies and ontologies, and make existing applications smarter. 

Programming includes practical how-to’s, fascinating use cases that showcase the power of text analytics, new techniques and technologies, and new theoretical ideas that drive text analytics to the next level.

 

Monday, Nov 15

Optional Workshop/Conference Day

 

Optional Workshop/Conference Day

Monday, November 15: 9:00 a.m. - 5:00 p.m.

Text Analytics Forum 2021 is part of a unique program of four co-located conferences this November: Text Analytics Forum, KMWorld, Taxonomy Boot Camp, and Enterprise Search & Discovery. Please take an opportunity to explore these events and their content, then choose a Platinum Pass to gain full access to these distinct but synergistic conferences.

 

Grand Opening Reception in the Enterprise Solutions Showcase

Monday, November 15: 5:00 p.m. - 6:30 p.m.

Join us for the Enterprise Solutions Showcase Grand Opening reception. Explore the latest products and services from the top companies in the marketplace while enjoying small bites and drinks. Open to all conference attendees, speakers, and sponsors.

 

Tuesday, Nov 16

Optional Conference Day

 

Optional Conference Day

Tuesday, November 16: 8:30 a.m. - 5:00 p.m.

Text Analytics Forum 2021 is part of a unique program of four co-located conferences this November: Text Analytics Forum, KMWorld, Taxonomy Boot Camp, and Enterprise Search & Discovery. Please take an opportunity to explore these events and their content, then choose a Platinum Pass to gain full access to these distinct but synergistic conferences.

Wednesday, Nov 17

Keynotes

 

Keynotes

Wednesday, November 17: 8:30 a.m. - 10:00 a.m.

Check back soon for keynote details.

 

Welcome and Keynote

Wednesday, November 17: 10:45 a.m. - 11:30 a.m.

What are the current and future trends for the field of text analytics? Join program chair Tom Reamy for an overview of the conference themes and highlights and a look at what is driving the field forward. This year’s main theme is the mutual enrichment of text analytics and AI as expressed in a growing variety of applications. Knowledge graphs continue to produce great applications. We also continue the exploration of machine learning and rules-based approaches and how people are combining them to get the best of both worlds. The talk wraps up with a look at current and future trends that promise to dramatically enhance our ability to utilize text with new techniques and applications.

Speaker:

Tom Reamy, Chief Knowledge Architect, KAPS Group, LLC, USA

 

Wednesday, Nov 17

Track 1: TA Development

 

State of the Art—Text Analytics Software

Wednesday, November 17: 11:45 a.m. - 12:30 p.m.

Recent Advances in Natural Language Processing
11:45 a.m. - 12:30 p.m.

Text analytics relies on natural language processing (NLP), powered by machine learning, semantic, and real-world knowledge systems. The NLP state of the art is advancing rapidly. This talk brings you up to date on recent years' advances in distributional models, vector-space embeddings, representations, and transformers: BERT (and its descendants), GPT-3, and all that. We discuss special tasks such as question answering, conversational systems, natural language generation, and emotion AI. Naturally you'll want to know how, so we look at leading open-source and commercial NLP options and touch on data, bias, and ethics concerns. In sum, this talk takes a comprehensive look at what works in NLP and what to expect in days to come.

Speaker:

, Principal Consultant, Alta Plana Corporation

 

Text Analytics Development Process—Autocategorization

Wednesday, November 17: 1:30 p.m. - 2:15 p.m.

Introducing Auto-Classification at a Major National Foundation
1:30 p.m. - 2:15 p.m.

In fall 2019, the Robert Wood Johnson Foundation (RWJF) released a new enterprise taxonomy model for classification of grants and key information related to its grantmaking. Since rollout, a critical next step has been the introduction of auto-classification to augment RWJF's current manual taxonomy application process. To begin this effort, RWJF initiated a comprehensive text analytics software evaluation that led to a working prototype for the Topics facet of its taxonomy and used it to successfully reprocess and retag a large volume of grants that suffered from past over-indexing. The evaluation included a POC that built autocategorization rules which, utilizing a content structure model, achieved 95%-plus accuracy. These rules formed the basis for the first application. Informed by this effort, RWJF is now in the initial stages of integrating text analytics into grant-related applications and preparing to extend the configuration to additional taxonomy facets so it can begin moving forward with auto-classification. This includes getting staff ready for the changes and exploring a growing range of potential use cases. Join RWJF's knowledge management officer, Ari Kramer, and Tom Reamy of KAPS Group for a discussion about the approach RWJF is taking to auto-classification, key challenges, and lessons learned, as well as how RWJF is planning to leverage text analytics in the near- and long-term.

Speakers:

Ari Kramer, Knowledge Management Officer, Robert Wood Johnson Foundation

Tom Reamy, Chief Knowledge Architect, KAPS Group, LLC, USA

 

Text Analytics Development Process—Enterprise Workflow

Wednesday, November 17: 2:30 p.m. - 3:15 p.m.

From Data Collection to Action: High-Performing Experience Text Analytics at Work
2:30 p.m. - 3:15 p.m.

Most businesses and organizations realize the importance of understanding their constituents' (stakeholders, employees, customers, consumers, clients, etc.) experiences and opinions. Fewer, however, have a deep understanding of how to integrate speech and text information into their strategic decision making. Forsta shares its experience by breaking down the flow of information and identifying the best practices and traps in each step. Bushell covers solicited and unsolicited sources, storage and organization, preprocessing and categorization, and analysis and taking actions. She shows you how to tackle each step effectively and efficiently so you get the most out of your experience data, reduce users' decision-making risk, and increase your analytics cred.

Speakers:

, Head of Global Analytics, Forsta

, Principal Analytics Consultant, Forsta

 

Ask the Experts Panel

Wednesday, November 17: 4:00 p.m. - 5:00 p.m.

A panel of four text analytics experts answers questions that have been gathered before and during the conference, as well as some additional questions from the program chair. This has been one of our most popular features in previous years, so come prepared with your favorite questions and be ready to learn.

 

Wednesday, Nov 17

Track 2: Technical

 

Knowledge Graphs

Wednesday, November 17: 11:45 a.m. - 12:30 p.m.

Alert Detection Generating for Critical Geo-Chronolocated Events
11:45 a.m. - 12:30 p.m.

In response to tragic events, GEOLSemantics offers its technology to launch rescue missions and other interventions by generating alerts for critical geo-chronolocated events. This combined technological solution (mixing NLP and AI) can collect information on past, current, or future events that are identified through an ontology about disasters, information collection, and security, as well as accidents and other critical incidents. It is currently being successfully applied in a Smart City project to monitor threats, hazards, and critical events. The system monitors traditional and social media platforms and can locate and date events. It aims to extract information on important events from the time the first related message appears and before the events are trending. It can assign an accurate position by using metadata and elements from messages. Social media sources call for automatic speech transcription when dealing with videos and social network languages influenced by internet slang. In this context, NLP, and more precisely NLU, is essential to overcome language ambiguities and properly identify critical dates, locations, and events. The solution covers critical alert monitoring through geo-chronolocated events such as accidents, fire hazards, flooding, and other disasters, as well as violence and spontaneous events. An event involves three elements: action, place, time. By using an ontology combined with extraction rules, each element can be extracted based on the semantics, rather than just a pattern, so the information related to the event can be enriched. GEOLSemantics converts information gleaned from text into geographic coordinates by using a GIS database to locate events on a map.

Speakers:

, Scientific Director, GEOLSemantics

, Team Development Director, GEOLSemantics

 

Enriching Knowledge Graphs—A Two-Way Street

Wednesday, November 17: 1:30 p.m. - 2:15 p.m.

Knowledge Graphs—The New Model to Integrate Text & Data
1:30 p.m. - 2:15 p.m.

Organizations recognize that their internal data and content assets are key to establishing enterprise AI initiatives, making their internal workforce more efficient and more effective in serving their sector. However, they often feel restricted by stubborn legacy systems and old-school approaches to modeling and utilizing assets. Knowledge graphs offer a new approach to tackle the challenge of integrating text and data. Leveraging structured data sources to describe unstructured text enables better classification and discovery of that content. Conversely, extracting semantic meaning from unstructured content to enrich structured data makes data more accessible and prepared for a variety of use cases. This talk discusses the best practices for semantic data modeling and architecture design, including how to establish a use case, select core technologies, iteratively enrich your knowledge graph, and apply the model to downstream applications.

Speakers:

, COO, Enterprise Knowledge, LLC

, Senior Technical Analyst, Enterprise Knowledge, LLC

Semantic AI—How a Knowledge Graph Brings Quality to Machine Learning
1:30 p.m. - 2:15 p.m.

We all know that machine learning algorithms can only learn from historical data, but they cannot derive new insights from it. We also know that when implementing explainable AI (XAI) systems, it is crucial to make their decisions explainable and transparent, incorporating new conditions and regulatory frameworks quickly. It's in the nature of machine learning algorithms that the basis of their calculated rules cannot be explained; they are just “a matter of fact.” Including knowledge graphs as a prerequisite to calculate not only rules, but also corresponding explanations can be a solution to this. A semantic AI architecture is based on machine learning as well as knowledge graphs, where data analysts and knowledge scientists work together, making use of a knowledge graph to directly extract data that can be quickly transformed into structures for analysis. The results of the analyses themselves can then be reused to enrich the knowledge graph. The semantic AI approach thus creates a continuous cycle in which both machine learning and knowledge scientists play an integral part. Knowledge graphs act as an interface in between, providing high-quality linked and normalized data. This talk outlines the building blocks of a semantic AI architecture and shows some concrete examples.

Speaker:

, COO, Semantic Web Company

 

Graph Neural Networks

Wednesday, November 17: 2:30 p.m. - 3:15 p.m.

Graph Neural Networks for Text Classification & Relation Extraction
2:30 p.m. - 3:15 p.m.

Enterprises are subscribed to the power of modeling data as a graph and the importance of building knowledge graphs for customer 360 and beyond. The ability to explain the results of AI models and produce consistent results from them involves modeling real-world events with the adaptive schema consistently provided via knowledge graphs. Graph neural networks (GNNs) have emerged as a mature AI approach used by companies for knowledge graph enrichment via text processing for news classification, question and answer, search result organization, and much more. A graph can represent many things—social media networks, patient data, contracts, drug molecules, etc. GNNs enhance neural network methods by processing the graph data through rounds of message passing; as such, the nodes know more about their own features as well as neighbor nodes. This creates an even more accurate representation of the entire graph network. This presentation discusses the advantages of GNNs for text classification and relationship extraction.

Speaker:

, CEO, Franz Inc.

 

Ask the Experts Panel

Wednesday, November 17: 4:00 p.m. - 5:00 p.m.

A panel of four text analytics experts answers questions that have been gathered before and during the conference, as well as some additional questions from the program chair. This has been one of our most popular features in previous years, so come prepared with your favorite questions and be ready to learn.

Thursday, Nov 18

Keynotes

 

Keynotes

Thursday, November 18: 8:30 a.m. - 10:00 a.m.

Check back soon for keynote details.

 

Thursday, Nov 18

Track 1: Tools & Techniques

 

TA Use Cases

Thursday, November 18: 10:15 a.m. - 11:00 a.m.

Tools for Enhancing Legacy Knowledgebases
10:15 a.m. - 11:00 a.m.
Speaker:

, CEO, Applied Knowledge Sciences, Inc.

 

TA Tools – Video

Thursday, November 18: 11:15 a.m. - 12:00 p.m.

Text Analytics for Non-Textual Assets
11:15 a.m. - 12:00 p.m.

Digital assets such as video, audio recordings, and still images are rich sources for content analysis. However, in order to apply text analytics methods, we need to rely on textual representations of these non-textual objects. This session discusses how we can leverage machine-generated transcripts and human-entered metadata as the basis of analysis. It looks at how we can build an annotation pipeline using APIs and services as well as at some outcomes and lessons learned from a business use case.

Speaker:

, Corporate Taxonomist, IBM

Low-Image Quality Documents—A Hybrid Approach to Automation
11:15 a.m. - 12:00 p.m.

Manual redaction of sensitive information within historical document images is an important and mind-numbing task. In the past, automation was confounded by poor image quality and high document variability. The challenge stems from the spectrum of quality of the source documents, where the biggest concerns lie with those that are not digitally born. From this aspect, one must develop a robust mechanism for grading the source document for quality as well as an infrastructure to house the document components for feature or rule development to occur. Given the wide array of document types that could be consumed, it is from this stage that classification can take place utilizing purpose-built models to understand what next steps need to be taken. Finally, from this stage, we can employ other rule sets or models to achieve the task of processing automation. This presentation explores these various stages to provide insights into lessons learned and the approaches that were found that provided the greatest benefits.

Speaker:

, CEO, Blackmarker

 

Keynote Luncheon

Thursday, November 18: 12:00 p.m. - 1:00 p.m.

Check back soon for keynote details.

 

TA Techniques – Vector Space Representation

Thursday, November 18: 1:00 p.m. - 1:45 p.m.

Mapping Employee Knowledge: A Comparison of Two Word-Embedding Algorithms
1:00 p.m. - 1:45 p.m.

The Inter-American Development Bank is a multilateral development institution with a mission to work with governments and other actors to address development challenges in Latin America and the Caribbean. This mission motivates the Bank to constantly strengthen its capacity by generating and acquiring knowledge from its operations in 27 countries, as well as from external sources. These efforts have included using deep learning to create a multilingual language model to map employee expertise and contextualize user queries for improved relevance of search results. Still, even in this focused field, there are numerous tools that can be used. This presentation will compare performance using a word-level vector-embeddings algorithm (word2vec) and a character-level vector-embeddings algorithm (fastText) to power search for people within the organization and reflect on future applications.

Speaker:

, Senior Knowledge Management Specialist, Knowledge, Innovation, and Communication Sector, Inter-American Development Bank

 

TA Applications

Thursday, November 18: 2:00 p.m. - 2:45 p.m.

Taxonomies & Text Analytics for Recommendation Systems
2:00 p.m. - 2:45 p.m.

Recommendation systems go beyond the limitations of search to get the right information to the right people by providing suggestions, such as for content, products, opportunities, connections, etc. A knowledge-based recommendation system, making use of a knowledge graph and text analytics, has advantages over other recommender technologies. This session presents a prototype recommendation system, an "HR Recommender" (for jobs, projects, and people to connect to), and explains how it is built. It is based on a semantic model of taxonomies and an ontology, content that is text-mined, algorithms for calculating similarities, a search index, and a front-end user interface. A demo of the recommender application and a demo of a tool for text mining and taxonomy/ontology modeling behind it are presented.

Speaker:

, Data and Knowledge Engineer, Author, The Accidental Taxonomist, Semantic Web Company

An NPS Survey Analysis Powered With Natural Language Processing Algorithms
2:00 p.m. - 2:45 p.m.

Today's corporate organizations have widely adopted the Net Promoter Score (NPS) survey as a tool to assist in measuring and better understanding overall customer satisfaction as well as brand health. However, limited insights have been drawn from the survey text comments beyond the calculated numerical NPS score. Attempts to manually code the NPS survey text comments are time-consuming and yield inconsistent categorization of topics and sentiments. Text analytics enables companies to mine rich insights from the open-ended free-text comments. We use advanced NLP (natural language processing) algorithms to leverage the voice of customers. We first use a topic modeling algorithm to categorize each comment. Second, we run an aspect-based sentiment analysis, as one comment may cover multiple topics. Lastly, we join the NLP data (topics and sentiment scores) with customers' attribute data and conduct a clustering analysis to create customer segments to understand who the customers are, what they comment on, and how they feel about the company's products and services. This NPS survey analysis, powered with advanced NLP algorithms, generates actionable insights for the business to improve customer satisfaction and brand health.

Speaker:

, Director of Applied AI, Walden University

 

TA Advanced Tools & Techniques

Thursday, November 18: 3:00 p.m. - 3:45 p.m.

Early Detection of Emerging Trends
3:00 p.m. - 3:45 p.m.

Medical library science specialists, voice-of-customer program managers, politicians, financial and insurance managers, and competitive intelligence analysts could all benefit from early detection of emerging trends. However, timely discovery of important new issues is not an easy task. It requires processing huge amounts of text data to identify previously unknown but growing trends, which frequently reveal themselves as a weak signal against a background of strong noise from already-known issues. We discuss the technological foundation and the methodology for deploying a system that automates early detection of new issues. Also, we provide a live demonstration of a system that automates the discovery of emerging trends for a pharma company. Then we discuss the benefits provided by this system, as well as typical organizational challenges encountered during the implementation.

Speaker:

, CEO, Megaputer Intelligence Inc.

T5 – A Swiss Army Knife for Many Text Analytics/NLP Tasks?
3:00 p.m. - 3:45 p.m.

The deep learning sequence architecture of the open-domain T5 model from Google gives us an easy way of implementing a number of natural language processing (NLP) tasks with a minimum of training data. As such, T5 is like a Swiss army knife, with multiple blades that can be easily adapted and configured to perform a variety of tasks. This talk describes T5 and demonstrates its use on several analytics tasks such as text classification, question answering, and data augmentation.

Speaker:

, VP Engineering, Voise, Inc.

 

Thursday, Nov 18

Track 2: AI & Rules

 

Semantic Rules & Machine Learning

Thursday, November 18: 10:15 a.m. - 11:00 a.m.

Natural Language & Text Analytics: Limitations of & Alternatives to the Data-Driven & Machine Learning Methods
10:15 a.m. - 11:00 a.m.

Data-driven, statistical, and machine learning (ML) approaches are the currently dominant paradigm in the use of natural language processing (NLP) in text analytics. This talk discusses the limitations of the data-driven approach, particularly in tasks such as sentiment analysis, text filtering, and media monitoring. We argue that these methods can produce results that are, at best, probably, approximately, correct. Moreover, these methods are not scalable, as they require continuous training on massive amounts of data that are often not available. Instead, we argue for a semantic counter-revolution, where deep semantic analysis as well as ontological knowledge repositories are employed. As part of this, a brief description of the semantic method is presented with a discussion of actual use cases in knowledge management and e-discovery.

Speaker:

, Founder and Principal AI Scientist, ontologik.ai

 

Machine Learning & Semantic Rules

Thursday, November 18: 11:15 a.m. - 12:00 p.m.

Deep Learning to Classify Adverse Events From Patient Narratives
11:15 a.m. - 12:00 p.m.
Speakers:

, Principal Solutions Architect, SAS

, Operation Research and Data Analyst, U.S. Food and Drug Administration (FDA)

Co-Located With
  • KMWorld 2021
  • Enterprise Search & Discovery 2021
  • Taxonomy Boot Camp