JOBIM J-Factor Utility
Optimized for latency and cost. Ideal for large-scale integration.
Model Overview
Base Model
Proprietary K=64 Architecture
Endpoint
/v1/models/jobim-jfactor-utility
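Assuming the endpoint follows the OpenAI-compatible models route (the quick start below configures the standard OpenAI client against `https://api.jobim.ai/v1`), a minimal sketch to fetch the model record looks like this:

```javascript
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.JOBIM_API_KEY, // same key as the chat API
  baseURL: 'https://api.jobim.ai/v1',
});

// GET /v1/models/jobim-jfactor-utility — returns the model's metadata record
const model = await openai.models.retrieve('jobim-jfactor-utility');
console.log(model.id, model.owned_by);
```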
The J-Factor Advantage
JOBIM J-Factor is not quantization; it is a deep re-architecture of weight storage and activation. We deliver the same response quality as leading models with 98.2% less VRAM and 13.6 tokens/sec throughput.
- Compression: 98.2%
- Throughput: 13.6 TPS
- Inference VRAM: < 2 GB
Recommended Use Cases
- Customer support chatbots
- High-volume content generation
- Real-time summarization
- Entity extraction & classification (see the sketch after this list)
- Low-latency API endpoints
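As an illustration of the entity extraction & classification use case, here is a hedged sketch; the system prompt and JSON output format are illustrative choices, not an official JOBIM schema:

```javascript
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.JOBIM_API_KEY,
  baseURL: 'https://api.jobim.ai/v1',
});

// A low temperature keeps extraction output deterministic.
const extraction = await openai.chat.completions.create({
  model: 'jobim-jfactor-utility',
  temperature: 0.0,
  messages: [
    {
      role: 'system',
      content: 'Extract all person and organization names from the user text. ' +
        'Reply as a JSON array of {"text", "type"} objects.',
    },
    { role: 'user', content: 'Tim Cook announced a partnership between Apple and OpenAI.' },
  ],
});

console.log(extraction.choices[0].message.content);
```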
API Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | string | — | Use jobim-jfactor-utility |
| messages | array | — | List of message objects |
| max_tokens | integer | 2048 | Maximum tokens to generate per response; the model supports a 128K-token context window |
| temperature | number | 0.7 | Sampling temperature, 0.0 – 2.0 |
| stream | boolean | false | Stream tokens in real time (see the sketch after this table) |
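Assuming streaming mirrors the standard OpenAI chat-completions interface, setting `stream: true` returns an async iterable of delta chunks; a minimal sketch:

```javascript
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.JOBIM_API_KEY,
  baseURL: 'https://api.jobim.ai/v1',
});

// With stream: true the client yields completion chunks as they are generated.
const stream = await openai.chat.completions.create({
  model: 'jobim-jfactor-utility',
  messages: [{ role: 'user', content: 'Summarize the Pareto Principle.' }],
  stream: true,
});

// Each chunk carries a token delta; print them as they arrive.
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}
```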
Quick Start Code
JavaScript / TypeScript
```javascript
import OpenAI from 'openai';

// The API is OpenAI-compatible: point the official client at the JOBIM base URL.
const openai = new OpenAI({
  apiKey: process.env.JOBIM_API_KEY,
  baseURL: 'https://api.jobim.ai/v1',
});

const completion = await openai.chat.completions.create({
  model: 'jobim-jfactor-utility',
  messages: [
    { role: 'user', content: 'Explain the Pareto Principle in one sentence.' }
  ],
  temperature: 0.1, // low temperature for a focused, deterministic answer
});

console.log(completion.choices[0].message.content);
```
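Set `JOBIM_API_KEY` in your environment before running; the generated text is available on `completion.choices[0].message.content`.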