Demand reality
Normal gateway use is materially lower than the 5,000-session engineering proof target.
A shareable view of the core AVPN assumptions: 300,000 registrations are programme-period scale, 5,000 concurrent sessions are a readiness ceiling, and the retainer works only with explicit usage caps and campaign approval rules.
Design and test for 5,000 sessions, but run ordinary months lean.
Monthly active usage and gateway minutes drive real concurrency.
Campaigns, grounding, and stronger models need explicit approval.
These choices make the retainer governable through 2029.
Baseline scenario selected.
Decision-safe baseline
This print view expands every tab into one decision pack: executive readout, demand reality, cost assumptions, AI/language controls, LTP operations, M&E, delivery roadmap, risks, decisions, and source caveats.
Modelled cloud and AI against the $950 infra + AI envelope.
The platform should be sized around an elastic baseline and load-tested against the 5,000-session readiness target. Campaign peaks and higher AI usage should be pre-approved through clear caps, while monthly dashboards make cost, language quality, and LTP verification visible to AVPN.
The answer is not "more servers." It is a governed operating model with transparent assumptions, visible thresholds, and pre-agreed decisions.
The concurrency estimate is driven by active learners, gateway minutes, and concentration.
Registrations across the programme period.
Used for planning pace.
Not the same as simultaneous users.
Scenario-controlled monthly active learners.
Time spent in the AVPN gateway, not external course time.
Computed scenario peak.
Average concurrent sessions = MAU x gateway minutes per learner per month / active minutes per month. Peak concurrent sessions = average concurrent sessions x peak concentration factor.
This is the mental model that should replace "300k equals 5k always online."
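The two formulas above can be sketched in a few lines of Python. The input values below are hypothetical placeholders, not the dashboard's scenario settings, and "active minutes per month" is assumed here to mean the minutes in a month during which learners are realistically online.

```python
def average_concurrent(mau: float, gateway_minutes_per_learner: float,
                       active_minutes_per_month: float) -> float:
    """Total gateway minutes in a month, spread evenly over the active window."""
    return mau * gateway_minutes_per_learner / active_minutes_per_month


def peak_concurrent(average: float, peak_concentration_factor: float) -> float:
    """Scale the even-spread average by how concentrated usage is at the peak."""
    return average * peak_concentration_factor


# Hypothetical inputs, for illustration only.
example_mau = 20_000                      # monthly active learners
example_gateway_minutes = 30              # gateway minutes per learner per month
example_active_minutes = 12 * 60 * 30     # assumed 12 active hours/day over 30 days
example_peak_factor = 4                   # assumed peak concentration factor

avg = average_concurrent(example_mau, example_gateway_minutes, example_active_minutes)
peak = peak_concurrent(avg, example_peak_factor)
print(f"average concurrent ~ {avg:.1f}, peak concurrent ~ {peak:.1f}")
```

Registrations never appear in the calculation: only active learners, their gateway time, and the concentration factor move the result.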
| Scenario | Peak sessions | Interpretation |
|---|---|---|
| Normal operations | ~130 | Lean baseline planning. |
| Peak campaign month | ~1,406 | Pre-warm, monitor, and cap usage. |
| 300k target spread month | ~2,500 | High but still below 5k. |
| Synchronized deadline surge | ~5,625 | Stress envelope, not retainer baseline. |
| Formal proof target | 5,000 | Load-test and document readiness. |

| Trigger | Why it matters | Commercial response |
|---|---|---|
| All learners pushed through one synchronized deadline | Peak factor rises sharply. | Temporary campaign capacity window. |
| Gateway becomes full LMS hosting | Session time and storage increase. | Separate architecture and cost model. |
| Always-on 5k reserve requested | Idle capacity becomes the cost driver. | AVPN-funded guaranteed-capacity option. |
| External course providers return inconsistent data | Verification queues and support load rise. | LTP schema governance and change-order rules. |
These lines must stay distinct in client conversations.
| Bucket | Amount | Meaning |
|---|---|---|
| Development | $154,000 one-time | Build phase, June to November 2026. |
| Maintenance retainer | $3,000/month | Operating retainer across the maintenance period. |
| GCP infrastructure sub-line | $700/month | Baseline cloud platform allowance. |
| Vertex/Gemini AI sub-line | $250/month | Capped AI usage allowance, separate from dev-hours. |
| Other tool/API sub-lines | $450/month | Email, video testimonial, survey, translation, and related tools. |
The retainer covers agreed operations within approved assumptions. It is not unlimited cloud, AI, translation, email, video, survey, or feature scope.
Only cloud and AI are modelled here; other tool/API lines remain separate.
Against $700/month infra line.
Against $250/month AI line.
Cloud + AI + fixed tool/API lines.
Of $700 allowance.
Of $250 allowance.
Labour + vendor/tool lines.
These variables drive the AI line and reveal when a change order is needed.
MAU expected to use the assistant.
Monthly text conversations priced by model.
Used for grounding and operational load.
Input/output token spend before grounding.
Search/grounding uplift after free allowance.
Shows whether the selected scenario fits the $3,000/month retainer.
Submitted maintenance retainer.
Calculated all-in maintenance line.
Positive means remaining headroom.
Submitted maintenance total.
If this scenario held every month.
Requires approval if above $0.
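A minimal sketch of the retainer-fit check described above, under stated assumptions. The labour figure is an assumption: it is taken as whatever remains of the $3,000 retainer after the three sub-lines. The cloud inputs in the usage lines are the document's own cloud-mode estimates ($588 elastic baseline, $1,708 pre-warmed campaign month); the AI input is a hypothetical month spent exactly at the $250 cap.

```python
RETAINER = 3_000          # submitted maintenance retainer, USD/month
INFRA_LINE = 700          # GCP infrastructure sub-line
AI_LINE = 250             # Vertex/Gemini AI sub-line
TOOLS_LINE = 450          # other tool/API sub-lines
# Assumption: labour is the remainder of the retainer after the sub-lines.
LABOUR_LINE = RETAINER - INFRA_LINE - AI_LINE - TOOLS_LINE


def retainer_fit(cloud: float, ai: float,
                 tools: float = TOOLS_LINE, labour: float = LABOUR_LINE) -> dict:
    """Compare a modelled all-in month against the submitted retainer."""
    all_in = cloud + ai + tools + labour
    headroom = RETAINER - all_in          # positive means remaining headroom
    overage = max(0, -headroom)           # requires approval if above $0
    return {
        "all_in": all_in,
        "headroom": headroom,
        "overage": overage,
        "needs_approval": overage > 0,
    }


baseline = retainer_fit(cloud=588, ai=250)
campaign = retainer_fit(cloud=1_708, ai=250)
print(baseline)
print(campaign)
```

Under these assumptions the elastic baseline month leaves headroom, while the pre-warmed campaign month produces an overage that would need separate approval.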
The visible formula layer behind the monthly model.
| Assumption | Value used | Why it matters |
|---|---|---|
| Chatbot usage volume | MAU x adoption x sessions/user | Conversations, not registrations, drive the text-token AI line. |
| Conversation token profile | 10,000 input tokens + 2,000 output tokens | Text-only planning assumption. Halve or double costs if real conversations are half or 2x this size. |
| Default volume route | Gemini 3.1 Flash-Lite: $0.25 / 1M input + $1.50 / 1M output | Recommended default for most production learner-support conversations. |
| Escalation route | Gemini 3.5 Flash: $1.50 / 1M input + $9.00 / 1M output | Use when ambiguity, risk, confidence, or quality requires it. |
| Routed policy | 85% Gemini 3.1 Flash-Lite + 15% Gemini 3.5 Flash | Near-premium UX without paying premium pricing on every conversation. |
| Grounding/search uplift | First 5,000 grounded prompts free, then $14 / 1,000 prompts | External grounding is the fastest way to break the $250 AI line. |
| Cloud formula | $430 base + peak/MAU uplift above planning thresholds | Separates elastic baseline from campaign or stress capacity. |
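The token and grounding assumptions in the table can be turned into a small calculator. This is a sketch, not the dashboard's own model: prices and the 85/15 split are copied from the table, grounding is assumed to be billed pro rata per prompt beyond the free allowance, and the monthly volumes in the usage lines are hypothetical.

```python
TOKENS_IN = 10_000        # input tokens per conversation (planning assumption)
TOKENS_OUT = 2_000        # output tokens per conversation (planning assumption)

# USD per 1M tokens as (input, output), copied from the assumptions table.
FLASH_LITE = (0.25, 1.50)
FLASH = (1.50, 9.00)


def cost_per_conversation(price, tokens_in=TOKENS_IN, tokens_out=TOKENS_OUT):
    """Text-only token cost of one conversation on a single model."""
    price_in, price_out = price
    return (tokens_in * price_in + tokens_out * price_out) / 1_000_000


def routed_cost_per_conversation(lite_share=0.85):
    """Blended cost when lite_share of conversations stay on the default route."""
    return (lite_share * cost_per_conversation(FLASH_LITE)
            + (1 - lite_share) * cost_per_conversation(FLASH))


def grounding_cost(grounded_prompts, free=5_000, price_per_1000=14.0):
    """Assumes pro-rata billing per prompt beyond the free allowance."""
    billable = max(0, grounded_prompts - free)
    return billable * price_per_1000 / 1_000


def monthly_ai_cost(conversations, grounded_prompts=0):
    return conversations * routed_cost_per_conversation() + grounding_cost(grounded_prompts)


# Hypothetical monthly volumes, for illustration only.
tokens_only = monthly_ai_cost(10_000)
with_grounding = monthly_ai_cost(10_000, grounded_prompts=20_000)
print(f"tokens only: ${tokens_only:.2f}, with grounding: ${with_grounding:.2f}")
```

With these placeholder volumes, token spend alone stays well inside the $250 line, and the grounded prompts are what push the month over it.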
One-time build cost by workstream, separated from maintenance.
| Workstream | Budget | Client meaning |
|---|---|---|
Converts the selected monthly model into a full programme view.
$154k build + $96k maintenance.
Build + selected monthly model x 32.
Positive means remaining programme headroom.
Modelled TCO over 300,000 registrations.
Selected maintenance model over MAU.
AI line divided by text conversations.
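The programme-level figures above follow from a handful of constants stated in this pack ($154,000 build, $3,000/month retainer, 32 months, 300,000 registrations). The sketch below shows the arithmetic; the monthly model, MAU, AI spend, and conversation count passed in the usage line are hypothetical inputs.

```python
BUILD_COST = 154_000      # one-time development, USD
RETAINER = 3_000          # submitted maintenance retainer, USD/month
MONTHS = 32               # maintenance months in the programme view
REGISTRATIONS = 300_000   # programme-period registrations


def programme_view(monthly_model: float, mau: float,
                   ai_monthly: float, conversations: float) -> dict:
    """Convert a selected monthly model into programme-level figures."""
    budget = BUILD_COST + RETAINER * MONTHS
    modelled_tco = BUILD_COST + monthly_model * MONTHS
    return {
        "budget": budget,
        "modelled_tco": modelled_tco,
        "headroom": budget - modelled_tco,        # positive means headroom
        "cost_per_registration": modelled_tco / REGISTRATIONS,
        "maintenance_per_mau": monthly_model / mau,
        "ai_cost_per_conversation": ai_monthly / conversations,
    }


# Hypothetical selected scenario, for illustration only.
view = programme_view(monthly_model=2_888, mau=20_000,
                      ai_monthly=96.25, conversations=10_000)
for name, value in view.items():
    print(f"{name}: {value:,.4f}")
```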
Monthly and 32-month values are calculated from the selected scenario.
| Line | Budget/mo | Model/mo | Delta/mo | 32-month model | Status |
|---|---|---|---|---|---|

| Scenario | Peak | Cloud | AI | Maint./mo | Status |
|---|---|---|---|---|---|

| Route | AI/mo | Delta vs $250 | Read |
|---|---|---|---|
Text-only estimate using 10,000 input + 2,000 output tokens per conversation.
| Model | Cost / conversation | 1k convos | 5k convos | 10k convos | Strategic read |
|---|---|---|---|---|---|
Useful for offline reporting, classification, and bulk processing; not the default for live learner chat.
| Model | 1k convos | 5k convos | 10k convos | Use when |
|---|---|---|---|---|

| Option | 1k | 5k | 10k | Read |
|---|---|---|---|
These rows translate usage choices into monthly overage and decision language.
| Decision trigger | Added cost/mo | If 32 months | Client decision |
|---|---|---|---|

| Inside retainer when capped | Separately approved or pass-through |
|---|---|
| Business-hours support, QBR/reporting, 5 dev-hours, dependency/security checks. | 24/7 coverage, new feature scope, major integrations, or additional dev capacity. |
| Elastic baseline cloud and ordinary storage/monitoring within the $700 infra line. | Always-on 5k reserve, dedicated cluster, campaign pre-warm, or load-test windows. |
| Routed AI support within the $250 AI line and approved fallback behaviour. | High-quality model everywhere, broad web grounding, unusually high chatbot adoption, or no per-session cap. |
| Standard email, video testimonial, survey, and translation tool allowances. | Large notification blasts, heavy media usage, new survey tools, or expanded translation volumes. |
| Common LTP upload schema, validation, queue ageing, and agreed reporting. | Provider-specific parser logic, custom LTP workflows, or remediation of poor source data. |

| Mode | Estimate/month | Status |
|---|---|---|
| Elastic baseline: Cloud Run + managed services | $588 | Fits $700 |
| GKE Autopilot baseline | $677 | Tight |
| Pre-warmed campaign month | $1,708 | Approve separately |
| 5k load-test / stress month | $4,668 | Not steady state |
| Dedicated 10-node Kubernetes equivalent | $2,903 | Different commercial model |
Same traffic, different model policies.
Model routing keeps high-volume support away from expensive defaults while preserving escalation quality.
| Layer | Default behaviour | Control |
|---|---|---|
| FAQ first | Approved catalogue and platform guidance answer common questions. | No token spend where static support is enough. |
| Low-cost route | Bounded navigation and discovery support. | Per-session and monthly caps. |
| Escalation route | Ambiguous, sensitive, or quality-critical cases. | Stronger model, sampled review, budget visibility. |
| Guided help mode | If caps are approached, assistant degrades gracefully. | Learners are not shown blunt budget failure language. |
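One way to read the four layers is as an ordered routing decision. The sketch below is illustrative only: the `Request` fields, the route names, the per-session cap of 20 turns, and the 90% "approaching cap" threshold are all hypothetical, and checking caps before model selection is an assumption about precedence. Only the $250 monthly AI cap comes from this pack.

```python
from dataclasses import dataclass


@dataclass
class Request:
    matches_faq: bool        # answerable from the approved catalogue/guidance
    needs_escalation: bool   # ambiguous, sensitive, or quality-critical


def choose_route(req: Request, session_turns: int, monthly_spend: float, *,
                 session_cap: int = 20,            # hypothetical per-session cap
                 monthly_cap: float = 250.0,       # AI sub-line, USD/month
                 guided_threshold: float = 0.9):   # hypothetical "approaching" level
    """Return the layer that should handle this request."""
    # FAQ first: no token spend where static support is enough.
    if req.matches_faq:
        return "faq"
    # Guided help: degrade gracefully once a cap is reached or approached.
    if session_turns >= session_cap or monthly_spend >= guided_threshold * monthly_cap:
        return "guided_help"
    # Escalation: stronger model for cases that need it.
    if req.needs_escalation:
        return "escalation"
    # Default: low-cost route for bounded navigation and discovery support.
    return "low_cost"


print(choose_route(Request(matches_faq=False, needs_escalation=True), 3, 100.0))
print(choose_route(Request(matches_faq=False, needs_escalation=True), 3, 230.0))
```

Keeping the cap check ahead of model selection means a learner near the limit receives guided help rather than an error, which matches the intent of not showing blunt budget failure language.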
Lower-resource languages need review queues and explicit fallback rules.
The challenge of coordinating 25-30 LTPs is primarily workflow governance, not compute.
| Cadence | Surface | Decision it supports |
|---|---|---|
| Weekly | Upload success, rejected rows, schema exceptions, queue age. | Which LTP needs support before reporting quality degrades. |
| Monthly | Completions, certificates, unresolved verification cases, LTP responsiveness. | Which delivery partners create programme risk. |
| Campaign window | Daily throughput, peak queue depth, error types, support tickets. | Whether to extend campaign capacity or intervene operationally. |
| Quarterly | Data quality trend and schema-change requests. | Whether the common template needs governance updates. |
These are the client-facing reporting modules the platform should make visible after launch.
| Module | Questions answered | Core fields |
|---|---|---|
| Learner funnel | Where do learners drop off? | Registration, recommendation, click-through, start, completion, certificate, survey. |
| Programme reach | Who is being reached across markets? | Country, cohort, language, approved demographic fields, LTP, course. |
| LTP delivery | Which partners are operationally healthy? | Uploads, completions, rejected rows, verification ageing, support exposure. |
| AI and language quality | Where is the support layer risky? | Usage, escalation, fallback, feedback, language tier, issue ageing. |
| Finance guardrails | Are costs inside the approved envelope? | Cloud, AI, translation, email, survey, campaign uplift, approval status. |
This is not just a dashboard. It is the governance layer that makes the fixed retainer, LTP network, and multilingual AI support manageable.
A five-month build only stays credible if each phase has a visible proof gate and a named decision it resolves.
Lock scope, billing stance, AI policy, languages, and LTP data rules.
Implement learner registration, recommendation/routing, identity, and admin foundations.
Validate uploads, completion evidence, certificates, queue ageing, and exception handling.
Operationalize language QA, cost dashboards, graceful fallback, and M&E exports.
Run load proof, UAT, support rehearsal, runbook review, and go-live decision gate.
| Gate | Pass condition | Owner decision |
|---|---|---|
| Cost gate | Infra, AI, translation, email and survey usage have caps and reporting. | Approve baseline vs pass-through model. |
| Scale gate | Load test documents 5,000-session envelope and campaign pre-warm runbook. | Approve launch/campaign window. |
| Language gate | Tiered language QA, fallback, feedback and incident review are live. | Approve production language list. |
| LTP gate | At least one pilot upload proves validation, queue ageing and audit trail. | Approve common upload schema. |
| Handover gate | Runbooks, ADRs, backups, org-owned access and support routes are reviewed. | Approve maintenance transition. |

| Risk | Impact | Mitigation |
|---|---|---|
| Retainer interpreted as unlimited usage | Commercial | Hard caps, assumptions table, approval workflow. |
| 5k treated as steady-state capacity | Cost | Load-test ceiling plus campaign pre-warm option. |
| Low-resource language answer is wrong | Trust | FAQ grounding, feedback, review sampling, fallback and incident process. |
| LTP CSVs are inconsistent | Operations | Common template, validation, on-hold queue, audit trail. |
| One-person technical dependency | Continuity | Runbooks, ADRs, code review, org-owned access, named backup coverage. |
| Exact pricing drifts before submission | Evidence | Refresh vendor pricing before final commercial quote. |
Exact prices and product capabilities should be refreshed before a final commercial submission.
This dashboard is a planning and decision-support artifact. Final contractual numbers should be reconciled against approved scope, region, billing ownership, selected AI model policy, campaign assumptions, and live vendor pricing.