Text Analytics Forum 2017

Nov 8 - 9, 2017 // Washington, DC

Text Analytics Forum 2017 Schedule

Monday, November 6, 2017

9:00 a.m. - 6:30 p.m.

Optional Workshop & Conference Day

Tuesday, November 7, 2017

8:00 a.m. - 6:00 p.m.

Optional Conference Day

Wednesday, November 8, 2017

8:00 a.m. - 8:30 a.m.

Continental Breakfast

8:30 a.m. - 9:30 a.m.

Keynote - Wow, Woo, Win: KM for Customer Delight
Our popular writers, speakers, and authors of Wow, Woo, Win: Service Design, Strategy & the Art of Customer Delight look at how customer experience and service design can enhance knowledge sharing and success in organizations. They discuss the importance of designing your organization around service and offer clear, practical strategies based on the idea that designing services is markedly different from manufacturing. When customers have more choices than ever before, study after study reveals that it’s the experience that makes the difference. To provide great experiences that keep customers coming back, organizations or KM programs must design services with as much care as they design products. Service design is proactive: it is about delivering on your promise to customers in accordance with your strategy. Our speakers share with you how to create “Aha” moments, when the customer makes a positive judgment, and how to avoid “Ow” moments. They provide tips on how you and customers create a bank of trust, fueled by knowledge of each other’s skills and preferences.
Tom Stewart, Executive Director, The Ohio State University
Patricia O'Connell, President, Aerten Consulting

9:30 a.m. - 9:45 a.m.

Keynote - Relevance Maturity Model: Revolutionizing With AI-Powered Search
Everyone who engages with your organization is in search of something, whether it’s products, services, people, or support. Too much of their time is spent sifting through useless information. New advances in machine learning and AI technology, combined with contextual search, are finally bringing relevance to every interaction and are making knowledge management a key driver of real business results. See real-world examples of the impact that increased maturity has made on innovative companies. Learn actionable steps to increase the relevance of your organization and start positively impacting your bottom line.

9:45 a.m. - 10:00 a.m.

Keynote - Energizing Communities of Practice
Communities of practice are a great way to develop expertise and innovation around specific interests. Through demonstrations of recent advances in Office 365 that infuse intelligence into many experiences, you’ll see how to leverage tacit and explicit knowledge in different ways, as well as how to reuse and build upon the work of others. Our speaker has extensive experience in enterprise collaboration systems and currently leads intelligent search and discovery for Microsoft 365. Expect lots of tips & examples for improving your KM initiatives.
Naomi Moneypenny, Director, Product Development, Microsoft Viva, Microsoft

10:00 a.m. - 10:45 a.m.

Coffee Break in the Enterprise Solutions Showcase

10:45 a.m. - 11:00 a.m.

Welcome & Introduction to the Text Analytics Forum
Tom Reamy, Chief Knowledge Architect & Founder, KAPS Group

11:00 a.m. - 11:30 a.m.

A Deep Text Look at Text Analytics
With the recently published book Deep Text: Using Text Analytics to Overcome Information Overload, Get Real Value From Social Media, and Add Big(ger) Text to Big Data as a guide, author Tom Reamy provides an extensive overview of the whole field of text analytics: What is text analytics, how to get started, development best practices, latest applications, and building an enterprise text analytics platform. The talk ends with a look at current and future trends that promise to dramatically enhance our ability to utilize text with new techniques and applications.
Tom Reamy, Chief Knowledge Architect & Founder, KAPS Group

11:45 a.m. - 12:30 p.m.

Text Analytics Market Insights: What’s Working & What’s Next?
Text analytics emerged in the mid-2000s, a collection of technologies, solutions, and practices aimed at meeting a diversity of business challenges. A decade in, what’s new and promising, what’s tried-and-true, and what’s on the horizon? Sentiment, identity, personality, and intent, extracted from text: All are now part of the data science mix. How has the market evolved—both demand and supply—and how should practitioners, solution providers, business analysts, and investors stay on top of developments? Text Analytics Market Insights helps all interested parties understand what’s working and what’s next, to enable them to extract the greatest value from text analytics.
Seth Grimes, Principal Consultant, Alta Plana Corporation

12:30 p.m. - 1:30 p.m.

Attendee Luncheon in the Enterprise Solutions Showcase

1:30 p.m. - 2:15 p.m.

AI & Text Analytics
AI and Transfer Learning: Teaching Computers to Understand Language
Artificial intelligence is transforming text analytics. However, most AI algorithms still lack the ability to understand text data that is different from what is encountered during training. This becomes a critical issue when algorithms encounter unfamiliar words, misspellings, acronyms, or words in a different language. The solution? Transfer learning -- the ability for an AI system to take what it has learned in one situation and apply it to new and different situations. But is transfer learning ready for business deployment, or is it still an emerging technology? How can it be used in text analytics today? Havasi will discuss how transfer learning can be applied to text analytics across multiple industries.
Catherine Havasi, CEO and Co-founder, Luminoso
AI vs. Automation: The Current State of Automated Content Tagging
Some technologies such as IBM Watson are being touted as AI. In response, there are new AI offerings from the large enterprise software companies as well as many startup companies. But is this AI or automation? This talk discusses the difference and argues that these offerings use entity extraction and business rules rather than AI. However, there are real opportunities to use this new technology to automate content tagging.
Joseph Busch, Principal Analyst, Taxonomy Strategies
Case Studies
A New Way of Working: Graph & Semantics, Text Analytics, & Linked Data
Using case studies of real-world client projects, Smartlogic’s CEO presents, discusses, and demonstrates how post-relational databases, text analytics, AI, semantics, and linked data are delivering rapid returns on investment in data-intensive industries. Cases range from predictive analytics and financial risk assessment to compliance, superior customer service, and unified enterprise intelligence within industries including banking, life sciences, media, and healthcare. The talk looks at the technology, the opportunity, lessons learned, and the keys to project success.
Jeremy Bentley, Head, MarkLogic

2:30 p.m. - 3:15 p.m.

Cognitive Computing & Graph Databases
Graph Stores Combined With Text Analytics
Text analytics can discover and add underlying structure to content, providing some remarkable new capabilities for the enterprise. This session focuses on the discovery of relationships between data and the population of graph databases and graph search. There are now more than 30 graph databases on the market, ranging from Neo4j to SQL Server 2017, and graph search has become mainstream (including Lucene 6, the Microsoft Graph, and many more). Text analytics is important for these to provide value to organizations.
Jeff Fried, Director, InterSystems
NLP & Entity Extractors for Cognitive Computing & Semantic Graph Databases
NLP and entity extractors make up an important part of our use cases in cognitive computing. We discuss how terminology systems and knowledge bases are used in combination with NLP and entity extractors to greatly enrich the contents of our data infrastructures.
Jans Aasman, CEO, Franz Inc.
Search & Text Analytics
Leveraging Text Analytics to Build a Personalized Information Retrieval Environment
To improve the effectiveness of information findability and usability, we are developing a new mechanism to understand users’ interests and predict the information that will be most relevant to their needs. We analyze the technical documents published by members of the workforce and build models that can be used to match users’ requests with the best available content. We utilize an existing hierarchical taxonomy as part of the clustering effort in order to provide preliminary labels for the clusters. The information retrieval environment we are building will not only support retrieval of relevant corporate information upon request; it is also designed to proactively notify targeted members of the workforce when relevant information becomes available.
Pengchu Zhang, Computer Researcher, Sandia National Laboratories
Using Text Analytics, Taxonomy, & Search to Probe Ignorance & Risk
In this talk, Patrick Lambe takes an unconventional look at how text analytics, taxonomies and search can be used in concert to probe areas of ignorance, not just uncover and organize what is already known, via three problem cases from the areas of public health and public transport. We demonstrate how elements of the search and discovery technology stack can be used to detect patterns in the environment to address or mitigate these types of problems.
Patrick Lambe, Principal Consultant, Straits Knowledge

3:15 p.m. - 4:00 p.m.

Coffee Break in the Enterprise Solutions Showcase

4:00 p.m. - 5:00 p.m.

Ask the Experts Panel
A panel of four text analytics experts answer questions that have been gathered before the conference, during the conference, and some additional questions from the program chair and sponsors.

Thursday, November 9, 2017

8:00 a.m. - 8:45 a.m.

Continental Breakfast

8:45 a.m. - 9:45 a.m.

Keynote - KM Buy-In: Proven Practices
For a KM initiative to be successful, knowledge managers must secure the support of senior leaders before implementation. Early top management buy-in results in funding, resources, advocacy, usage, broad organizational support, and success: the program yields its expected benefits, and KM is spoken of and written about positively by leaders, stakeholders, and users. Hear from our long-time KM practitioner about proven practices illustrated by real-world examples for securing resources, active participation, and ongoing advocacy from top leadership. Get lots of tips for leading an effective, sustainable KM program that is seen as essential to the success of companies in different industries, of different sizes, and with different cultures.
Stan Garfield, Founder, SIKM Leaders Community

9:45 a.m. - 10:00 a.m.

Keynote - Beyond the Box: How Search is Driving Data Access in a Hybrid World
For more than a decade, search technology has been used as the primary access point to the mountains of knowledge and data sitting behind an organization’s firewall. As environments evolve to account for private and public clouds, search is evolving beyond just the box to an API for human information. Will Hayes explores that evolution and talks about how search technologies and professionals play a key role in the enterprise cloud migration strategy.
Will Hayes, CEO, Lucidworks

10:00 a.m. - 10:15 a.m.

Coffee Break

10:15 a.m. - 11:00 a.m.

Machine Learning, Taxonomy, Search
Combining Machine Learning, Text Analytics, & Semantic Web for Automated Tagging
This talk describes work that the IBM Taxonomy Squad has done to develop an enterprise-scale service that automates the extraction of entities and the generation of meaningful metadata. We cover the approach that was taken to design a solution architecture that leverages a corporate knowledge base and integrates best-of-breed services in taxonomy and ontology management, NLP, machine learning, text annotation, and entity extraction.
Dan Segal, Information Architect, IBM
The Savior Machine: Text Analytics, Machine Learning, & the Role of Taxonomy
The promise of machine learning has become a practical reality in today’s enterprise, but companies often struggle with implementation or reliable results. One fundamental issue is the common “garbage in, garbage out” problem. Poor input stems from the lack of clean data or unclear results from unstructured data analysis feeding machine learning models. Well-built taxonomies powering clear text analytics rules are an important infrastructure need often overlooked in data science activities. Come learn more about the role of taxonomy and text analytics as sources of clean data for machine learning.
Ahren Lehnert, Senior Taxonomist, Genentech
Fake News & Bad Ad Placement
News Analytics System
For modern digital enterprises, the key to survival is held by real-time predictive analytics done with heterogeneous data gathered from multiple sources—layered with contextual intelligence. The data is a mix of structured and unstructured data. Establishing contextual relevance requires systems imbued with deep reasoning capabilities that can link relevant pieces of information from within and outside the organization. This talk presents the outlines of a framework that can gather news events in real time, classify them, reason with them, and finally link them to an enterprise information repository and thereby generate alerts or early warnings for subscribed users. The framework is presented through a number of case studies.
Lipika Dey, Principal Scientist, Tata Consultancy Services
Content Meets Interest - Contextual Ad Targeting by Means of Cognitive Computing
The globally increasing tendency for political populism and media criticism has raised the sensitivity of brands to avoid misplacement of their own campaigns in negative and compromising contexts (bad ads). However, ad targeting is predominantly based on behavioral targeting techniques that heavily rely on (cookie-based) user profiling. The talk showcases a solution for real-time contextual targeting that is exploiting the full power of cognitive computing to match campaigns to online users’ real interests. The approach abandons tracking of any kind of user data and at the same time increases the precision of ad targeting on a real semantic level—beyond what can be achieved with keyword-based methods.
Heiko Beier, CEO, MORESOPHY

11:15 a.m. - 12:00 p.m.

Machine Learning vs. Rules
Automatic Classification: Rules-Based vs. Training-Set-Based Bakeoff
Machine learning techniques can be used effectively for a wide variety of text analysis scenarios, such as reputation monitoring on social media, fraud detection, patent analysis, and e-Discovery. But to apply them well, you need to understand where the limits and pitfalls are in the technology, and you need to understand your data and the problem you are trying to solve. This session outlines an approach that uses text analytics to help understand the characteristics of your data, followed by selection and tuning of linguistic and statistical processing and machine learning parameters to address the application at hand. We highlight three real-world projects that used this approach and show how they worked, what went right and wrong, and how they evolved over time.
Jeff Fried, Director, InterSystems
Text Analytics & Machine Learning
Government agencies face tremendous challenges daily. This includes providing services to ensure a safe, livable environment; making informed spending decisions; and regulating a healthy economy. The data that supports these missions is exploding and is increasingly unstructured. This presentation discusses the application of text analytics and visualizations across a number of these datasets and respective initiatives to provide an actionable view into the data. This involves demystifying techniques including predictive modeling and machine learning in this domain. We show how these techniques can be applied to research analytics, government spending, situational awareness, and assessing consumer financial complaints.
Tom Sabo, Advisory Solutions Architect, SAS Institute Inc.
Case Studies II—Banks & Publishing
Text Analytics & KM
The Inter-American Development Bank (IDB) is the main source of multilateral financing for Latin America and the Caribbean and, in addition to financing, provides knowledge responses to the Region’s development challenges. In this context, the IDB is constantly working to leverage new technology to improve its knowledge management in order to support efficiency in its operations and disseminate valuable knowledge and insights for the Region. To make this information more accessible and to solidify its value to the Bank’s business, we developed a series of proofs of concept (POCs) that use NLP and ML technologies. The purpose of this presentation is to share reflections gathered during the development of these POCs and the application of these types of approaches within the organization.
Kyle Strand, Lead Knowledge Management Specialist and Head of Library, Inter-American Development Bank (IDB)
Daniela Collaguazo, Text Analytics Consultant, Inter-American Development Bank
Bertha Briceno, Lead Specialist, Knowledge and Learning Sector, Inter-American Development Bank
Machine Learning in Practice
In the last 10 years, most of the academic research on entity extraction and content classification has focused on machine learning and complete automation. The latest tools are very precise, but in academic publishing, the use of automatic classification tools is still controversial. Publishers and information managers want the best of both worlds: a clear list of defined, managed keywords for their content and a cost-effective way of implementing the subject tagging. This presentation reviews the current use of machine-learning tools in publishing, both with and without the use of manually curated taxonomies.
Michael Upshall, Head of Business Development, UNSILO, Denmark

12:00 p.m. - 1:00 p.m.

Luncheon Keynote - Cognitive Search & Analytics: What It Is & Why You Should Care
If you are a believer in the data-driven organization (or even just curious) and have ever wondered what could happen if you cleverly combined the power of data collection, indexing, text mining, search, and machine learning into a unified platform and applied it within the enterprise, this talk is for you! Come learn about the state of cognitive search and analytics technology and how it is enabling great companies across a wide swath of industries to amplify mission-critical expertise within their business in a surprisingly short amount of time. Our speaker illustrates the technology in action with real-world examples.
Scott Parker, Director of Product Marketing, Sinequa

1:00 p.m. - 1:45 p.m.

Auto-Categorization & Summarization
Auto Categorization by Taxonomy: Pros, Cons, & Pragmatics
The terminologies that form taxonomies, thesauri, classification schemes, and name authorities aim to define all concepts unambiguously. These conceptual definitions are, however, primarily written for a human audience and are only partially meaningful to automated categorization processes. This talk explores how automated categorization rules can be synthetically generated by mining the terminology and semantic relationships found in traditional knowledge organization systems. We examine the pros, cons, and limitations of using categorization rules derived from KOS and discuss how they can then be refined and extended using human-curated categorization rules.
Dave Clarke, CEO, Squirro
Search, Semantic Analysis, Text Mining
This talk presents an original approach to processing search results. Rather than showing the usual 10 blue links to webpages, the software creates a text summary of those webpages: a narrative on the topic of the user’s query. The narrative gives the user a quick way to understand the key information related to the query. This approach is best suited to queries that are informational in nature, i.e., those where the user wants to understand a particular subject and get a quick grasp of a concept, an event, a product, or a public figure. The talk focuses on the merits and drawbacks of the approach and compares it with other techniques of presenting the answer to the user’s query.
Dmitri Soubbotin, Founder & CEO, Semantic Engines LLC
Text Analytics & Taxonomy
Bringing It All Together (At Last): Integrating Structured & Unstructured Information With Text Analytics & Ontologies
Organizations are always looking for better ways to integrate their structured (databases and reports) and unstructured (documents and webpages) information. This concept is not new; in fact, it has been the primary information management goal for many years. The difference is that today, the technology to make this happen has matured to the point that this is real. This talk shares real-life examples of how this is done in large repositories using text analytics and ontologies. Session attendees will understand what an ontology is and how it can be merged with text analytics tools to provide better analytics for their data scientists.
Zach Wahl, President & CEO, Enterprise Knowledge
Taxonomies & Text Analytics
This presentation discusses two recent enterprise projects that benefited from direct interactions between taxonomies/ontologies and text analytics. While these are often seen as competing work streams, our recent work continues to build on the idea that complex information-rich projects require both, and that pursuing one while abandoning the other often leads to poor results or project failure.
Gary Carlson, Founder, Factor

2:00 p.m. - 2:45 p.m.

Text & Data Together
Extracting Content for Linked Data Triples
There is much talk about building triple stores from source content, but most of the models are just that: models without content to back them up. This session covers a case study of building a triple store to support search and other use cases from nearly 6 million documents. It also looks at the extraction or mining process for pulling 22 types of triple sets from full text and redeploying them for search queries. Lessons learned are also covered.
Marjorie M.K. Hlava, President & Chairman, Access Innovations, Inc.
Text Analytics in the Context of Semantic Datasets & Ontologies
This session addresses the main principles of extracting entities and relationships from unstructured content against ontologies and semantic datasets. We give industry examples of business cases and key components of semantic technology architectures, including text analytics and supporting data and metadata governance workflows. Finally, we demonstrate semantic annotation, talking about the challenges organizations face in this regard and some of the important lessons learned in more than 15 years of industry experience.
Borislav Popov, Text Analytics and Annotation, Ontotext AD
New Applications
Human-Like Semantic Reasoning
Addressing the complexity of language ambiguity requires a technology that can read and understand text the way people do. This session explains the concepts behind linguistic analysis, word disambiguation, and semantic reasoning that support a semantic platform, and demonstrates a semantic engine. It also explains how one mobile phone carrier deployed a self-help solution that automatically answered 24,000,000 customer questions annually with 94% precision, and shows a knowledge platform that automatically organizes hundreds of data sources and millions of unstructured documents around multiple corporate taxonomies and entity clusters using dynamically generated metadata in a precise and complete way.
Bryan Bell, Regional Vice President of Sales, Lucidworks
Breaking Down Silos With Text Analytics
The next phase of how we communicate has already started. Popularized by Siri, Alexa, and the like, natural language interaction (NLI) has achieved commercial Q&A success. For organizations looking to adopt new experiences with their customers, NLI holds promise. But there is a big difference between AI applications—the distinction is the degree to which they are intelligent. This talk examines the considerations for enterprise application of NLI and how to avoid applications that just drive more white noise.
Fiona McNeil, Global Technology Product Marketer, SAS

3:00 p.m. - 3:45 p.m.

Measuring the Results
Measuring Auto-Categorization Quality
Auto-categorization is “auto” only in part—there is much in the process that still requires old-fashioned human judgment. One critical step on the human side of the fence is to evaluate the quality of results so refinements can be developed and fed back into the process. But how do you measure quality when human indexers themselves apply topics inconsistently and often differ over applicability of topics? This case study explains how one publisher approached quality assessment in light of human variability and details how classic recall and precision measures were adjusted to provide a user-focused sense of auto-categorization quality.
Larry Lempert, Director, Product Research and Planning, Bloomberg BNA
Information Retrieval Performance Measurement Using Extrapolated Precision
In search, there is often a trade-off between recall and precision, and this impacts any evaluation of approaches: If one system achieves higher recall but lower precision, is it better? Traditionally, this situation has been addressed by using a measure that combines precision and recall into a single number, such as the F1 score. F1 makes strong assumptions about the amount of precision you can trade for a little more recall, and those assumptions are not always appropriate. In some contexts, recall and precision have very different significance. This talk presents a novel performance measure called the extrapolated precision, which avoids making such strong assumptions about allowed trade-offs between precision and recall.
Bill Dimm, Founder & CEO, Hot Neuron LLC
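As a small illustration of the trade-off described in the abstract above, the sketch below computes the standard F1 score (the harmonic mean of precision and recall) for two hypothetical systems. The precision and recall values are invented for illustration, and the sketch does not attempt to implement the talk's extrapolated precision measure, which the abstract does not define.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical systems (values are assumptions, not from any talk):
# system A favors precision, system B favors recall.
system_a = f1_score(precision=0.9, recall=0.5)
system_b = f1_score(precision=0.5, recall=0.9)

# F1 treats precision and recall symmetrically, so both systems get the same score.
print(f"A: {system_a:.3f}  B: {system_b:.3f}")
```

Because F1 is symmetric, it cannot distinguish these two systems, even in settings where missing relevant documents is far costlier than reviewing irrelevant ones (or the reverse).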
Application Issues
Leveraging Text Analytics to Build Applications
In the world of scholarly publishing (as well as many other industries, such as KM/information conferences!), meeting organizers are inundated with submissions for inclusion in conference programs. Given a large set of submissions, how can we develop tools to cluster submitted manuscripts into tracks based on topical similarity? This talk describes a project that used a subject taxonomy, NLP, and other text analytics tools as well as a large corpus of documents to construct an application to cluster submitted manuscripts based on topical similarity, including a GUI to interact with and analyze the results. This is not intended as a detailed technical talk (no slides of code!), nor is it intended as a product spotlight; the focus is on using known/existing text analytics tools to construct purpose-built applications to solve specific document-centric problems.
Bob Kasenchak, Information Architect, Factor, USA
Maximizing Analytic Value From Multi-Language Text Feeds
The universe of text analytics is largely constrained to the output of the entire human race. This can and does result in huge, petabyte-scale problems. Technologies for this scalability, computational distribution, deep learning, resolution, and semantic expression are all new within the last 10 years, and their combination is revolutionary. Key to putting all of this together is that the text analytics are performed in the native language of the original text, prior to the inevitable loss of fidelity in machine or human translation. This talk covers a number of use cases including counterterrorism, knowing your customer, border security, disease tracking and detection, and countering fake news and conspiracy theories.
Christopher Biow, SVP, Global Public Sector, Basis Technology

4:00 p.m. - 4:15 p.m.

Keynote - Creating Unified Views of Data With Semantic Graphs
In recent years, document-centric search over information has been extended with the use of graph-based content and data models. The implementation of semantic knowledge graphs in enterprises not only improves search in a traditional sense but also opens up a path to integrating all types of data sources in a most agile way. Linked data technologies have matured in recent years and can now be used as the basis for numerous critical tasks in enterprise information management. Hilger discusses how standards-based graph databases can be used for information integration, document classification, data analytics, and information visualization tasks. He shares how a semantic knowledge graph can be used to develop analytics applications on top of enterprise data lakes and illustrates how a large pharmaceutical company makes use of graph-based technologies to gain new insights into its research work from unified views and semantic search over heterogeneous data sources.
Joseph Hilger, COO, Enterprise Knowledge

4:15 p.m. - 5:00 p.m.

Closing Keynote - KM in the Age of Digital Transformation: Magic Sauce for a Successful Future
At the cross-section of innovation, open data, and education, our speaker, a former government KM practitioner, shares her thoughts about the challenges and opportunities for organizations and communities in the coming years. She discusses empowering members of our communities and improving services using new tech like AI, machine learning, virtual and augmented reality, Internet of Things, predictive analytics, gamification, and more. Are we moving toward anticipatory knowledge delivery (just enough, just in time, just for me), being in the flow of work at the teachable moment, establishing trust in a virtual environment, and learning from peer-to-peer marketplaces like Airbnb and Uber? Our longtime KM practitioner shares her insights about the evolving digital transformation of every part of our world and hints at the magic sauce we need for a successful future!
Jeanne Holm, Senior Technology Advisor to the Mayor, Deputy CIO, City of Los Angeles
