Machine Learning Systems Intern

  • Posted on: Apr 07, 2026
  • BrainChip
  • Laguna Woods, California
  • Salary: Not Available
  • Full-time

Job Title: Machine Learning Systems Intern

Job Type: Full-time

Job Location: Laguna Woods, California, United States

Remote: No

Job Description

Hybrid SSM‑Transformer models have a unique advantage for on‑chip memory efficiency:


  • SSM layers compress sequence history into a fixed‑size recurrent state
  • Attention layers store key‑value caches that grow with context length


This leads to an important design question:

For a given model configuration and maximum context length, can on‑chip SRAM be sized so that inference runs entirely on chip—eliminating the need for slower off‑chip HBM or DRAM?
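
As a back-of-the-envelope illustration of the trade-off, the sketch below compares how the two kinds of inference state scale with context length. All layer counts and dimensions are hypothetical assumptions for the example, not BrainChip model specifics:

```python
# Illustrative memory model for a hybrid SSM-Transformer stack.
# All dimensions below are assumed for the example, not real model specs.

BYTES_PER_ELEM = 2  # fp16/bf16

def kv_cache_bytes(n_attn_layers, n_kv_heads, head_dim, context_len):
    # Keys + values per attention layer: grows linearly with context length.
    return n_attn_layers * 2 * n_kv_heads * head_dim * context_len * BYTES_PER_ELEM

def ssm_state_bytes(n_ssm_layers, d_model, state_dim):
    # Fixed-size recurrent state: independent of context length.
    return n_ssm_layers * d_model * state_dim * BYTES_PER_ELEM

for ctx in (1_024, 8_192, 65_536):
    kv = kv_cache_bytes(n_attn_layers=4, n_kv_heads=8, head_dim=64, context_len=ctx)
    ssm = ssm_state_bytes(n_ssm_layers=20, d_model=2048, state_dim=16)
    print(f"ctx={ctx:>6}: KV cache {kv / 2**20:8.1f} MiB | SSM state {ssm / 2**20:.2f} MiB")
```

Under these assumptions, at 64K context the KV cache reaches hundreds of MiB while the SSM state stays at a few MiB, which is exactly why the SRAM-sizing question above is interesting for hybrid models.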


What the intern will work on:


The intern will model and analyze memory behavior during inference of hybrid SSM‑Transformer models, with a focus on avoiding off‑chip memory accesses. Key responsibilities include:


  • Modeling data movement between SRAM and HBM/DRAM during inference
  • Sweeping parameters such as:
      • SRAM capacity
      • Context length
      • Model dimensions
  • Mapping the feasibility boundary where inference can be performed fully on chip (a minimal sweep sketch follows this list)
  • Breaking down per‑layer memory working sets
  • Identifying when and why memory spills occur
  • Exploring tiling and scheduling strategies to extend the no‑spill region
  • Validating analytical results through simulation
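
A minimal sketch of the kind of feasibility sweep described above, reusing the illustrative kv_cache_bytes and ssm_state_bytes functions from the earlier sketch; a real working-set model would also account for weights, activations, and per-layer tiling:

```python
# Feasibility sweep: for each SRAM budget, find the largest power-of-two
# context length whose inference working set still fits entirely on chip.
# Reuses kv_cache_bytes / ssm_state_bytes from the sketch above.

def working_set_bytes(context_len):
    # Total inference-time state for the illustrative hybrid stack.
    return (kv_cache_bytes(n_attn_layers=4, n_kv_heads=8, head_dim=64,
                           context_len=context_len)
            + ssm_state_bytes(n_ssm_layers=20, d_model=2048, state_dim=16))

for sram_mib in (16, 64, 256, 1024):
    budget = sram_mib * 2**20
    max_ctx, ctx = 0, 1
    while ctx <= 1 << 20:
        if working_set_bytes(ctx) <= budget:
            max_ctx = ctx
        ctx *= 2
    print(f"SRAM {sram_mib:>5} MiB -> fully on-chip up to ctx = {max_ctx}")
```

The points where working_set_bytes first exceeds the budget trace out the feasibility boundary; tiling and scheduling strategies aim to push that boundary outward.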


Position Details

Posted: Apr 07, 2026

Reference Number: 378cfd6476727f15

Employment: Full-time

Salary: Not Available

City: Laguna Woods

Job Origin: ziprecruiter
