Internship “SMOLL: Scheduling Multi-agent Operations on Lightweight LLMs”

University of Brest

3 to 6 months

35 hours a week

English B2 / French B2

The emergence of lightweight large language models (LLMs) enables inference on edge devices (ranging from embedded systems to end-user devices), unlocking new possibilities for multi-agent systems operating in distributed environments. In multi-agent contexts, such as autonomous fleets, industrial monitoring, or collaborative robotics, each agent may run its own LLM instance, tailored to its hardware and task. Lightweight language models, optimized for low-resource platforms, allow agents to perform local reasoning, reducing reliance on centralized infrastructure and improving responsiveness. However, the variety of LLMs, the heterogeneity of edge hardware (CPUs, GPUs, NPUs, etc.), and dynamic workloads introduce complex scheduling challenges. Efficient scheduling must allocate hardware resources and distribute inference tasks across agents and devices, balancing latency, energy consumption, and model accuracy.

Hosting lab: Lab-STICC (SHARP department / SHAKER team)

SHARP aims to study models, methods and design support tools centered on the architecture of embedded systems and their environment. In particular, SHAKER focuses on optimizing such software/hardware systems through (joint) modeling of their architecture, design of systems that adapt to environment constraints and application requirements, verification of system behavior and properties, and implementation of tools that support these processes.

Compensation: 600 euros per month as per national law
Disciplines: Information and Communication Technologies

Keywords: heterogeneous computing, large language models, distributed systems, characterization, scheduling

Tasks and duties entrusted to the student:

Conducting a state-of-the-art study on lightweight LLMs, edge devices, and scheduling challenges and solutions for multi-agent systems
Establishing a systematic characterization methodology to determine the affinity of models to heterogeneous hardware platforms under diverse workload patterns
Running measurement campaigns, profiling relevant models on a variety of hardware platforms
Leveraging these characterization data in a predictive model to (a) guide the integration of future models into the system and (b) efficiently schedule inference tasks in a multi-agent setting

Skills to be acquired or developed:

Acquire a set of practical skills in profiling and benchmarking workloads: measurement and estimation of quality-of-service metrics, energy consumption, generation quality, etc.
Acquire an understanding of scheduling techniques, particularly in distributed systems settings
Develop scientific communication skills: the student will report on their progress through weekly presentations
Develop scientific writing skills: the student will contribute to writing a final paper to communicate their results

Vincent Lannurien (+33) 2 98 01 69 75 / vincent.lannurien@univ-brest.fr