Northwestern University Libraries

2026 Technical Refresh

Model of Models has been redesigned from the ground up. Every layer of the platform -- from the compute infrastructure to the training pipeline to the interface itself -- has been re-engineered for speed, reliability, and a markedly better research experience.

A New Pipeline Architecture

The legacy monolithic processing pipeline has been replaced by an event-driven microservices architecture. Every model training request is automatically routed through a three-tier compute decision engine:

  • Serverless Express Path -- Small jobs (500 documents or fewer) execute in a lightweight serverless function with zero cold-start overhead, returning results in seconds rather than minutes.
  • GPU Compute -- GPU-accelerated methods are dispatched to NVIDIA hardware equipped with dedicated video memory, unlocking hardware-accelerated training for embedding and transformer-based models.
  • CPU Compute -- General-purpose workloads run on high-performance multi-core processors with 16 GiB of memory, optimized for topic modeling and large-corpus analytics.
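The three-tier routing above can be sketched as a simple decision function. This is a minimal illustration only: the 500-document threshold and the set of GPU-accelerated methods come from this description, while the function name, tier labels, and method identifiers are hypothetical.

```python
# Sketch of the three-tier compute decision engine described above.
# The 500-document threshold and GPU-accelerated method set come from
# this announcement; all names here are illustrative assumptions.

GPU_METHODS = {"word2vec", "doc2vec", "bertopic"}  # embedding/transformer methods
EXPRESS_MAX_DOCS = 500  # small jobs take the serverless path

def route_job(method: str, num_documents: int) -> str:
    """Pick a compute tier for a training request."""
    if num_documents <= EXPRESS_MAX_DOCS:
        return "serverless-express"      # lightweight function, results in seconds
    if method.lower() in GPU_METHODS:
        return "gpu"                     # NVIDIA hardware with dedicated VRAM
    return "cpu"                         # multi-core workers, 16 GiB memory

print(route_job("multilevel-lda", 120))     # small corpus -> serverless-express
print(route_job("bertopic", 25_000))        # large transformer job -> gpu
print(route_job("multilevel-lda", 25_000))  # large topic model -> cpu
```

In this sketch the express check runs first, so a small job always takes the fastest path regardless of method; larger jobs then split by whether the method benefits from GPU acceleration.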

The routing is automatic and invisible to the researcher. Submit a query, and the platform selects the fastest available path.

GPU-Accelerated Visualizations

Four visualization methods have been optimized for accelerated compute -- three leveraging NVIDIA GPU cores and one using parallel multi-core training -- yielding substantial reductions in time-to-result:

Word2Vec GPU

Word embedding models train directly on GPU tensor cores, accelerating vector space construction for large vocabularies.

Doc2Vec GPU

Document-level embeddings leverage GPU parallelism to produce distributed representations of entire documents, enabling faster similarity analysis.

BERTopic GPU

Transformer-based topic modeling using sentence embeddings and GPU-native UMAP dimensionality reduction. The most computationally intensive method, now running on dedicated NVIDIA hardware.

Multilevel LDA 3x Faster

The flagship topic modeling visualization now trains six independent LDA models in parallel across multiple CPU cores, cutting wait times to roughly one-third of what they were.
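The fan-out pattern behind this speedup can be sketched as follows. The six-model count comes from the text; `train_lda` is a hypothetical stand-in for a real LDA trainer, and a thread pool stands in for the platform's multi-core workers.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for one LDA training run; a real trainer would
# fit a topic model over the corpus at the given topic count.
def train_lda(num_topics: int) -> dict:
    return {"num_topics": num_topics, "status": "trained"}

# Six independent models trained concurrently instead of one after
# another -- roughly a 3x reduction in wall-clock time when enough
# cores are available. The specific topic counts are illustrative.
TOPIC_COUNTS = [5, 10, 15, 20, 25, 30]

with ThreadPoolExecutor(max_workers=6) as pool:
    models = list(pool.map(train_lda, TOPIC_COUNTS))

print([m["num_topics"] for m in models])  # [5, 10, 15, 20, 25, 30]
```

Because each model is independent, no coordination is needed beyond collecting the results in order, which `Executor.map` handles automatically.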

Real-Time Status and Observability

The My Models dashboard now provides granular, real-time status updates throughout every phase of a job's lifecycle:

  • Batch state labels -- see exactly when a worker is booting from the warm pool, starting up, or initializing the pipeline, instead of a static "Scheduled" indicator.
  • Live in-place updates -- status changes, progress bars, and elapsed timers all update seamlessly without reloading the page. When a job completes, the card transitions to its finished state in place.
  • Three-second polling -- the dashboard checks for new status every three seconds, providing near-instant feedback on pipeline progress.
  • GPU memory monitoring -- if a GPU job exhausts its available video memory, the system detects the failure, logs diagnostic telemetry, and automatically retries the job on CPU.
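The GPU-to-CPU fallback above can be sketched in a few lines. The exception type, logging wiring, and retry logic here are illustrative assumptions, not the platform's actual implementation.

```python
# Sketch of the out-of-memory fallback described above: detect a VRAM
# failure, log diagnostic detail, and retry the same job on CPU.
# All names are hypothetical stand-ins.
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")

class GpuOutOfMemory(RuntimeError):
    """Raised when a job exhausts the GPU's video memory."""

def run_with_fallback(job, run):
    """Try the job on GPU; on VRAM exhaustion, log and retry on CPU."""
    try:
        return run(job, device="gpu")
    except GpuOutOfMemory as exc:
        log.warning("GPU OOM for %s: %s -- retrying on CPU", job, exc)
        return run(job, device="cpu")

# Toy runner: pretend the GPU always runs out of memory.
def toy_run(job, device):
    if device == "gpu":
        raise GpuOutOfMemory("out of VRAM")
    return f"{job} completed on {device}"

print(run_with_fallback("bertopic-123", toy_run))
```

The key property is that the failure is handled inside the pipeline, so from the dashboard's perspective the job simply continues on a different tier.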

Interface Redesign

The entire user interface has been refreshed with a modern design language. Glassmorphic surfaces, refined typography, and a consistent Northwestern visual identity replace the previous layout. A new theme toggle lets users switch between the refreshed look and the classic interface at any time.

Help Us Make This Better

Model of Models is under active development. If you encounter a problem or have an idea for improvement, we would like to hear from you.

Report a Bug or Suggest a Feature