Powering Text-to-Image AI with Scalable Human Preference Data

The Challenge

Building world-class text-to-image generative AI isn't just about training models on large datasets; it's about fine-tuning them with nuanced human judgment. For this leading GenAI app, that meant finding a way to gather high-quality preference data without slowing down their product team. But sourcing and labeling that data in-house would have required a huge lift, draining time and energy from their fast-moving development cycles.

The Approach

Instead of overloading their internal teams, the company chose to work with Databrewery. Through Databrewery's platform and Brewforce-powered Labeling Services, they were able to quickly spin up a scalable, human-in-the-loop pipeline. Skilled annotators and experienced project leads came together to handle the complex task of evaluating model outputs based on subtle user preferences, freeing the product team to stay focused on innovation.

The Outcome

With Databrewery in place, the team doubled the speed of their human preference data generation. What once took months could now be done in weeks. More importantly, their AI models got better, faster, thanks to high-quality, structured human input that scaled with their product ambitions.

Text to Image

A fast-growing generative AI company focused on creating visual content such as logos, posters, and imagery was facing a familiar challenge: how to improve model performance without overwhelming their product team. Their technology allows users to turn text prompts into vivid images, and while the core product was gaining momentum, training the models to better reflect user preferences required vast amounts of human feedback. Building this dataset internally would have diverted critical resources, slowed product releases, and pulled focus from their core mission of delivering intuitive, creativity-first tools.

To keep moving fast without compromising on quality, the team turned to Databrewery. With Databrewery’s specialized tools for collecting and ranking model outputs, and its Brewforce workforce providing expert human input, the company was able to implement a more efficient approach to training data creation. Rather than relying on internal resources to compare and classify outputs, they used Databrewery’s purpose-built LLM preference editor to streamline reinforcement learning from human feedback (RLHF), an approach that allows models to learn which outputs real users find most useful or visually appealing.
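At its core, the RLHF workflow described above rests on pairwise preference data: an annotator sees two outputs for the same prompt and records which one they prefer, and those comparisons train a reward model. The sketch below illustrates the idea with a hypothetical record schema and the standard Bradley-Terry pairwise loss; the field names and scores are illustrative, not Databrewery's actual data format.

```python
import math

# Hypothetical schema for a single pairwise preference record:
# an annotator compares two images generated from the same prompt
# and marks which one better matches their intent.
preference_record = {
    "prompt": "a minimalist logo of a fox, flat design",
    "chosen": "image_a.png",    # output the annotator preferred
    "rejected": "image_b.png",  # output the annotator passed over
}

def bradley_terry_loss(score_chosen: float, score_rejected: float) -> float:
    """Pairwise loss commonly used to train reward models from
    preference data: the loss is small when the model scores the
    chosen output well above the rejected one, and large when the
    ordering is reversed."""
    margin = score_chosen - score_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A correctly ordered pair yields a small loss; a reversed pair a large one.
low = bradley_terry_loss(2.0, -1.0)
high = bradley_terry_loss(-1.0, 2.0)
```

Aggregated over many such records, gradients of this loss teach the reward model which outputs real users find most useful or visually appealing, which is exactly the signal the generative model is then fine-tuned against.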

Generate Conversation

What set this partnership apart was the ability to incorporate human judgment at scale, especially across outputs generated from multiple models. Using Databrewery's latest multimodal chat tools, the team could simulate conversations, evaluate how different models performed on a variety of prompts, and collect structured rankings based on visual relevance, aesthetic appeal, and fidelity to input. This gave them a clear, repeatable way to measure model quality week over week, not just in isolation but against competitors and previous versions of their own systems.
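Turning per-criterion rankings into a repeatable week-over-week metric can be as simple as averaging each model's ranks across criteria. The sketch below assumes a hypothetical record format with the three criteria named above (the model names and numbers are made up for illustration) and collapses the rankings into a small leaderboard.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical ranking records for one prompt: annotators rank each
# model's output (1 = best) on the criteria used in the workflow.
rankings = [
    {"model": "model_v2", "visual_relevance": 1, "aesthetic_appeal": 2, "fidelity": 1},
    {"model": "model_v1", "visual_relevance": 2, "aesthetic_appeal": 1, "fidelity": 3},
    {"model": "competitor", "visual_relevance": 3, "aesthetic_appeal": 3, "fidelity": 2},
]

def score_models(records):
    """Collapse per-criterion ranks into one mean rank per model
    (lower is better), yielding a comparable score each week."""
    per_model = defaultdict(list)
    for r in records:
        per_model[r["model"]].extend(
            [r["visual_relevance"], r["aesthetic_appeal"], r["fidelity"]]
        )
    return {model: mean(ranks) for model, ranks in per_model.items()}

# Sort ascending by mean rank to produce a simple leaderboard.
leaderboard = sorted(score_models(rankings).items(), key=lambda kv: kv[1])
```

Tracking the same aggregate over successive evaluation rounds is what makes the comparison repeatable: the metric stays fixed while the models under it change.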

The result was a self-improving feedback loop: humans evaluated outputs, the data informed new training runs, and the updated models were quickly tested again. This cycle helped the company pinpoint weak spots, prioritize where to focus next, and make meaningful progress with each iteration. The company saw more than a 50% increase in data generation speed, and development timelines that previously spanned months were shortened to weeks, all without sacrificing quality or overburdening internal teams.

Looking ahead, they plan to continue scaling with Databrewery and Brewforce, expanding their access to highly skilled labelers with strong visual literacy and domain expertise. This will help them stay ahead of rising demand, while continuing to improve how users interact with and get value from generative AI.