Chapter 3

Dynamic Tagging and Fluid Product Understanding

The first use cases we will dive into relate to dynamically tagging any item or product. Most businesses are still trapped in a world of static product categories and fixed attributes. This approach made sense in the era of physical catalogs and limited computing power. However, in today's world of online catalogs and vastly greater computing power, it's ineffective.

It's not just outdated—it's a massive missed opportunity.

Let's start with a contrarian truth: The most valuable information about your products isn't in your catalog—it's in the minds of your users.

The Limitations of Static Understanding

Five years ago, the digital landscape was dominated by static categorization methods, a remnant of the physical catalog era. Whether in e-commerce, music streaming, or content platforms, the approach was largely the same:

  • Predefined category hierarchies
  • Manual tagging of products with fixed attributes
  • Occasional batch updates to reflect major changes

This approach, while familiar, has critical weaknesses:

  • Slow Adaptation: By the time a new category is manually added, the trend might already be fading. In the fast-paced world of music, for instance, new genres can emerge and evolve rapidly, outpacing traditional categorization methods.
  • Lack of Nuance: It fails to capture the nuanced ways different users perceive items. A "minimalist" desk lamp to one user might be "modern art" to another. Similarly, a song could be "relaxing" to one listener and "energizing" to another.
  • Struggles with Novelty: It breaks down for novel or niche items that don't fit neatly into existing categories. This is particularly challenging in areas like indie music or niche product markets.
  • Resource Intensive: Manually tagging and maintaining attributes is incredibly time-consuming, especially for large catalogs.
  • Inconsistency and Error-Proneness: Human-driven categorization inevitably leads to inconsistencies and errors, especially when dealing with large catalogs. Different team members might interpret categorization guidelines differently, leading to a lack of uniformity across the catalog.

Consider two contrasting examples that highlight these limitations:

  1. Wayfair's E-commerce Challenge: The online furniture retailer faced a seemingly insurmountable task with over 14 million items in their catalog. They estimated it would take 29 years for a team of 5 people to manually categorize their entire catalog – clearly unsustainable in a fast-moving market.

A picture of a yellow couch with accent pillows and a coffee table can be used to extract product tags, like "square arm", "sofa", "vintage modern" style, etc. for the couch, and "table", "round", "black", "4 legs" for the coffee table.

  2. Pandora's Music Genome Project: In stark contrast, Pandora took an early lead in moving beyond static categorization. They invested heavily in the Music Genome Project, a comprehensive effort to analyze and categorize music using up to 450 distinct musical characteristics per song. However, this process was still largely manual and extremely resource-intensive. According to a 2019 report, Pandora had spent over 20 years and millions of dollars on this project, with a team of professional musicians tagging songs at a rate of one song per 20-30 minutes.

Modeling Genre with the Music Genome Project: Comparing Human-Labeled Attributes and Audio Features

While Pandora's approach was more nuanced than traditional categorization, it still faced limitations:

  • The manual nature of the process made it difficult to keep up with the rate of new music releases.
  • The fixed set of attributes, while extensive, couldn't capture all the subjective ways listeners experience music.
  • The system struggled to adapt to rapidly evolving musical genres and cross-genre fusion.

These examples illustrate a fundamental problem: static understanding, even when highly detailed, fails to fully capture how humans think about products, music, or content. It assumes a one-size-fits-all approach to categorization, ignoring the personal, contextual, and ever-changing nature of human perception.

But the biggest problem? It fundamentally misunderstands how humans think about products.

The Promise of Dynamic Understanding

Dynamic tagging and fluid product understanding offer a radically different approach:

  1. Real-time extraction of product attributes from descriptions, images, and user behavior
  2. Continuous updating of product representations based on user interactions
  3. Personalized product understanding that adapts to individual user perspectives
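To make step 1 concrete, here is a minimal, dependency-free sketch of attribute extraction from a product description. The vocabulary and product text are invented for illustration; a real system would use an NLP or vision model rather than keyword matching:

```python
# Sketch: naive attribute extraction from a product description.
# A production system would use an NLP model; keyword matching against
# a hand-built vocabulary keeps this example self-contained.

ATTRIBUTE_VOCAB = {
    "material": ["oak", "steel", "rattan", "linen"],
    "style": ["minimalist", "vintage", "industrial", "scandinavian"],
    "color": ["black", "white", "yellow", "walnut"],
}

def extract_attributes(description: str) -> dict:
    """Return attributes found in the description, grouped by facet."""
    text = description.lower()
    found = {}
    for facet, terms in ATTRIBUTE_VOCAB.items():
        hits = [t for t in terms if t in text]
        if hits:
            found[facet] = hits
    return found

tags = extract_attributes("Minimalist oak desk lamp with a black steel base")
```

The same extraction could run on image-model output or user reviews; the point is that attributes are pulled from the content itself, not assigned once from a fixed taxonomy.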

Now, what does it really mean to be dynamic?

At its core, dynamic understanding is about creating systems that learn, adapt, and evolve in real-time. Instead of rigid categories and fixed attributes, we're talking about fluid, ever-changing representations of products and content. Here's what this looks like in practice:

Real-time Learning and Adaptation

Every time a user interacts with a product – whether it's a click, a purchase, or even just hovering over an image – the system gains new insights. This isn't just about collecting data; it's about understanding context and intent. This relies mainly on what we call "collaborative filtering," except that people believe it mainly impacts users and too often forget its impact on the product.


For example, let's say you're shopping for a lamp. In a static system, that lamp might be categorized simply as "lighting" or "home decor." But a dynamic system goes much deeper. It notices that users who buy this lamp often also purchase plants, natural fiber rugs, and books on mindfulness. Suddenly, that lamp isn't just a light source – it's part of a broader lifestyle category that might be called "eco-conscious living" or "mindful home design."
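This item-side view of collaborative filtering can be sketched with simple co-purchase counts. The baskets below are invented toy data; production systems mine millions of interactions:

```python
from collections import Counter
from itertools import combinations

# Toy purchase baskets (invented); each set is one order.
baskets = [
    {"lamp", "fiddle-leaf fig", "jute rug"},
    {"lamp", "mindfulness book", "jute rug"},
    {"lamp", "fiddle-leaf fig"},
    {"desk", "office chair"},
]

# Count how often each pair of items is bought together.
co_counts = Counter()
for basket in baskets:
    for a, b in combinations(sorted(basket), 2):
        co_counts[(a, b)] += 1

def neighbors(item: str, top: int = 3) -> list:
    """Items most often bought alongside `item`."""
    scores = Counter()
    for (a, b), n in co_counts.items():
        if a == item:
            scores[b] += n
        elif b == item:
            scores[a] += n
    return [other for other, _ in scores.most_common(top)]
```

Here `neighbors("lamp")` surfaces plants, rugs, and mindfulness books, which is exactly the signal that lets the system re-frame the lamp as part of an "eco-conscious living" cluster.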

Personalized Perspectives

One of the most powerful aspects of dynamic understanding is its ability to adapt to individual users. The same product can mean different things to different people, and dynamic systems recognize this.


Take a pair of running shoes. For a marathon runner, these might be categorized as "high-performance gear." For someone just starting a fitness journey, they might fall under "beginner-friendly exercise equipment." A dynamic system can present the same product in different contexts, tailoring the experience to each user's needs and interests.



Cross-Modal Insights

Dynamic understanding isn't limited to just text or just images. It combines insights from various sources – product descriptions, user reviews, images, videos, and even usage data – to create a rich, multi-dimensional understanding of each item.


This cross-modal approach allows for some pretty remarkable capabilities. A system might "see" that a shirt has a certain pattern, "read" reviews mentioning its comfort, and "learn" from purchase data that it's popular for casual office wear. All these insights combine to create a nuanced understanding that goes far beyond simple categories like "men's shirts" or "casual wear."
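One simple way to combine such signals is a weighted tag fusion. The sources, tags, and weights below are illustrative assumptions, not a prescribed scheme:

```python
from collections import defaultdict

# Sketch: fusing tags proposed by different modalities into one ranked
# tag profile. Scores and source weights are invented for illustration.
signals = {
    "vision":   {"plaid pattern": 0.9, "button-down": 0.8},
    "reviews":  {"comfortable": 0.7, "plaid pattern": 0.5},
    "purchase": {"casual office wear": 0.6},
}
source_weight = {"vision": 1.0, "reviews": 0.8, "purchase": 1.2}

# Sum each tag's score across sources, weighted by source reliability.
fused = defaultdict(float)
for source, tags in signals.items():
    for tag, score in tags.items():
        fused[tag] += score * source_weight[source]

ranked = sorted(fused, key=fused.get, reverse=True)
```

A tag confirmed by several modalities ("plaid pattern" here) naturally rises to the top, which is the practical payoff of the cross-modal view.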

Continuous Evolution

Perhaps the most important aspect of dynamic understanding is that it's never "finished." These systems are designed to continuously evolve, adapting to new trends, changing user behaviors, and emerging categories.

This is crucial in a world where new product categories can emerge overnight. Think about how quickly "smart home devices" went from a niche category to a major market segment. Dynamic systems can identify and adapt to these shifts in real-time, without needing manual updates or recategorization.
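A minimal sketch of how such a shift might be detected: compare tag frequencies across two time windows and flag tags whose usage grows past a threshold. The counts here are invented for illustration:

```python
# Sketch: flagging emerging tags by window-over-window growth.
# Frequencies are invented; real systems would stream these counts.
last_month = {"smart home": 40, "vintage": 200, "minimalist": 150}
this_month = {"smart home": 160, "vintage": 210, "minimalist": 140}

def emerging(prev: dict, curr: dict, growth: float = 2.0) -> list:
    """Tags whose frequency at least doubled (by default) since the last window."""
    return [t for t, n in curr.items() if n >= growth * prev.get(t, 1)]

new_trends = emerging(last_month, this_month)
```

In this toy data only "smart home" clears the growth threshold, so it would be promoted to a first-class category without any manual recategorization.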

Let's look at some real-world applications:

Pinterest uses computer vision and natural language processing to automatically tag and categorize the billions of "pins" on its platform. This enables much more nuanced and accurate content discovery.

Amazon's "Amazon Scout" uses AI to understand visual attributes of products, allowing users to refine searches based on style preferences that are hard to describe in words.

Technical Considerations

Achieving truly dynamic product understanding requires integrating several now-common AI models and technologies:

  1. Computer Vision: Deep learning models like Convolutional Neural Networks (CNNs) can extract visual attributes from product images with superhuman accuracy. Google's Cloud Vision API, for instance, can detect dominant colors, identify objects, and even read text within images. Of course, leveraging this effectively requires a robust, clean, and consistently formatted catalog and image database.
  2. Natural Language Processing: Transformer-based models like BERT can parse product descriptions and user reviews to extract key attributes and sentiments. This allows for understanding of product features that aren't explicitly tagged.
  3. Graph Neural Networks: These models can capture complex relationships between products, attributes, and users. Pinterest's PinSage algorithm, for example, uses graph convolutional networks to generate embeddings for pins based on their connections to other pins and boards.
  4. Reinforcement Learning: RL algorithms can continuously optimize tagging strategies based on user interactions. This allows the system to learn which attributes are most important for user decision-making.

But the real magic happens when we combine these with generative AI, particularly transformers and large language models.

A Gen AI Advantage

Gen AI, especially LLMs, brings several game-changing capabilities to product understanding:

  1. Zero-shot classification: Categorizing products into arbitrary taxonomies without specific training (basic on its own and not recommended as a standalone approach; we cover this topic in more depth in Chapter 8)
  2. Dynamic attribute generation: Creating new, relevant attributes on the fly based on context
  3. Cross-lingual understanding: Unifying product representations across languages and cultures
  4. Semantic search: Understanding the intent behind user queries, not just matching keywords
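As a toy illustration of the semantic search capability, the sketch below reduces "embeddings" to bag-of-words vectors and ranks catalog items by cosine similarity to the query. A real system would use learned embeddings from an encoder or LLM; this stays dependency-free on purpose:

```python
import math
from collections import Counter

# Sketch: semantic matching reduced to bag-of-words cosine similarity.
# Catalog entries are invented; real systems use learned embeddings.

def vectorize(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

catalog = {
    "lamp-1": "warm minimalist oak desk lamp",
    "sofa-2": "yellow vintage modern square arm sofa",
}

def search(query: str) -> str:
    """Return the catalog item whose description best matches the query."""
    vq = vectorize(query)
    return max(catalog, key=lambda k: cosine(vq, vectorize(catalog[k])))
```

Swapping `vectorize` for a learned embedding model is what turns this from keyword overlap into genuine intent matching.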

Imagine a system that can instantly understand a new product, generate relevant attributes, and place it in the right context for each individual user—all without any manual intervention.


The Strategic Implications

Now the main question: why does this matter?

For business leaders, the shift to dynamic product understanding represents some serious opportunities:

  1. Improved discoverability: Helping users find exactly what they're looking for, even if they don't know how to describe it
  2. Better recommendations: Understanding products at a deeper level allows for more nuanced, relevant suggestions
  3. Adaptive user experiences: Tailoring how products are presented based on individual user preferences and contexts
  4. Operational efficiency: Reducing the manual labor involved in catalog management and updates

But it also comes with challenges:

  1. Technical complexity: Building these systems requires significant AI and engineering expertise
  2. Data quality and scale: Dynamic understanding thrives on large amounts of high-quality data
  3. Operational costs: Calling an LLM in real time across many thousands of products or content items quickly becomes costly and hard to maintain

Let’s add one more non-obvious insight to all of this:

Dynamic product understanding isn't just about better categorization or search. It's about creating a shared language between business and users.

With static categories, there's often a mismatch between how businesses think about their products and how users think about their needs. Dynamic understanding offers a unique opportunity to bridge this gap, creating a fluid, evolving, and quasi-personal adaptive interface between business offerings and user intent.

Looking Ahead

As we look to the future, we see a convergence of dynamic understanding and generative capabilities. Imagine systems that can:

  1. Generate entirely new products based on emerging user needs and preferences
  2. Create personalized product descriptions and visualizations for each user
  3. Engage in natural language dialogue about products, truly understanding and responding to nuanced queries

This convergence promises to blur the lines between product discovery, creation, and personalization.

For enterprise leaders, the key is to start experimenting now. Here are some steps to consider:

  • Audit your current product categorization and tagging processes. Identify bottlenecks and areas where AI could add immediate value.
  • Invest in AI capabilities, particularly in computer vision and NLP. Consider partnering with AI-focused startups or leveraging cloud services from major providers like Google, Amazon, or Microsoft.
  • Start small: Choose a subset of your catalog such as a sub-category to experiment with dynamic tagging. Use this as a learning opportunity before scaling.
  • Rethink your data strategy: Ensure you're collecting the right data to fuel dynamic understanding systems. This might include more detailed user interaction data or richer product metadata.
  • Foster cross-functional collaboration: Dynamic product understanding touches everything from inventory management to marketing. Ensure all stakeholders are involved in the transition.

In the next chapter, we'll explore how these dynamic product understanding capabilities can be leveraged to create personalized content, taking us beyond curation to true content creation.