Mereka x AVPN - operating model dashboard

AVPN operating model for scale, cost, AI support, LTP delivery, and M&E.

Planning dashboard for scale, cost and operating controls: 300,000 registrations are programme-period scale, 5,000 concurrent sessions are a readiness proof target, and the retainer stays governable through explicit usage caps and campaign approval rules.

Product frame: Gateway + M&E layer

Readiness proof: 5,000 sessions

Maintenance model: $3,000/month

Operating rule: Capped usage

Recommended decision: Operate an elastic baseline and validate 5,000-session readiness through load testing.

Design for readiness, but avoid funding idle peak capacity every month.

Evidence: 300,000 is programme-period scale.

Monthly active usage and gateway minutes drive real concurrency.

Cost boundary: $700 infra + $250 AI are capped assumptions.

Campaigns, grounding, and stronger models need explicit approval.

Decision ask: Confirm caps, campaign windows, and LTP data rules.

These choices keep the retainer transparent through 2029.

How to read this dashboard

This is a planning model, not a production quote. It shows planning scenarios and approval triggers; it does not change the submitted scope, price, support model or billing structure. Lower estimated usage does not create a credit or additional scope. Higher estimated usage identifies where AVPN and Mereka should pre-approve campaign capacity, AI usage, support coverage or new scope.

Cost outputs depend on final GCP region, billing ownership, Cloud Run/GKE configuration, Cloud SQL tier, storage, egress, logging policy, AI model route, Google Search/Maps grounding policy, campaign calendar and live vendor pricing. The submitted commercial model remains the Mereka-managed, capped, single-invoice retainer unless AVPN and Mereka agree a different billing structure in writing.

Monthly active: 25,000

Gateway time: 10 min

Peak factor: 5x

AI route: Routed

Baseline scenario selected.

Baseline ready

AVPN Operating Model Decision Pack

This print view expands every section into one decision pack: executive readout, decision structure, due diligence, demand model, cost model, AI safeguards, LTP operations, reporting, delivery gates, risk register, and evidence references.

Monthly registration pace

8,108

300,000 learners spread over 37 programme months.

Projected peak sessions

58

Far below the readiness proof target.

Scenario cloud + AI estimate

$477

Cloud + AI within planning assumptions.

Decision read

Lean baseline

Default operating mode for ordinary months.

Demand reality

Normal gateway use is materially lower than the 5,000-session engineering proof target.

Baseline-ready

Executive answer

The dashboard is organised around AVPN's core due-diligence questions.

Can the platform scale? Use an elastic baseline and validate the 5,000 concurrent-session readiness target through load testing and campaign pre-approval rules.

How is the cost model made governable? The $700 infra and $250 AI allowances are tracked separately, with live calculator validation before production commitments harden.

Can AI be safe enough? Yes, if common support is FAQ-first and lower-cost, while ambiguous or higher-risk cases escalate to stronger models or humans.

Where is the real operating risk? LTP data quality, verification ageing, language QA, and maintenance boundaries are larger risks than raw server capacity.

Budget guardrail

Cloud and AI are tracked against separate monthly allowances: $700 for GCP infrastructure and $250 for AI usage.

GCP infra allowance used: 61%

AI allowance used: 19%
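
The two percentages above can be reproduced with a short calculation. This is a minimal sketch, assuming the current-scenario estimates of $430 for cloud and $47 for AI shown elsewhere in this dashboard; the function and constant names are illustrative only and are not part of the dashboard itself.

```python
# Monthly allowances stated in the dashboard (USD).
INFRA_ALLOWANCE_USD = 700.0
AI_ALLOWANCE_USD = 250.0


def allowance_used(estimate_usd: float, allowance_usd: float) -> float:
    """Return the share of an allowance consumed as a fraction (0.61 means 61%)."""
    if allowance_usd <= 0:
        raise ValueError("allowance must be positive")
    return estimate_usd / allowance_usd


# Current-scenario estimates taken from the dashboard's cost section.
infra_share = allowance_used(430.0, INFRA_ALLOWANCE_USD)
ai_share = allowance_used(47.0, AI_ALLOWANCE_USD)

print(f"GCP infra allowance used: {infra_share:.0%}")
print(f"AI allowance used: {ai_share:.0%}")
```

Keeping the two allowances as separate inputs mirrors the rule that cloud and AI are tracked against separate caps rather than one pooled budget.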

Operating rules

Elastic baseline: Default month stays lean; campaign capacity is planned and time-bound.

Hard caps: Monthly, per-session, and per-language AI limits prevent surprise spend.

Common schemas: LTP uploads need templates, validation, queues, and change-order boundaries.

Sendable position

Size the platform around an elastic baseline, load-test against the 5,000-session readiness target, and govern campaign peaks through approved capacity and AI limits.

Decision read

The operating model makes capacity, cost, language quality, and LTP verification visible through thresholds, evidence, and monthly reporting.

Decision structure

How the operating model answers AVPN's approval questions and connects each answer to evidence, controls, and decision boundaries.

Clarification area	Evolved answer	Evidence in this dashboard	Decision boundary
AI/API usage assumptions	AI is a bounded support layer. Deterministic and retrieval-scoped flows come first; model calls are routed by risk and need.	Cost controls, model price comparison, grounding/voice/self-host economics.	AVPN approves monthly caps, model route, grounding policy, and campaign overrides.
Architecture trade-off	Custom AVPN product layer on proven foundations. One canonical system of record; adapters and workflow jobs do not own programme truth.	Delivery gates, LTP workflow controls, risk register, source-of-truth language.	Month 1 confirms architecture decision record, identity route, workflow pattern, and handover docs.
UX when AI budget is constrained	Learners should see guided help mode, FAQ-first flows, recommendation cards, and human handoff.	AI support policy, cost escalation triggers, operating rules.	Approve fallback copy, priority journeys, and admin threshold alerts.
Lower-resource-language quality	Language quality is managed operationally: approved content, tiered QA, feedback, sampling, incident review, and fallback.	Language risk matrix, quality controls, monthly reporting promise.	Approve production language list and review coverage before launch.
5,000 simultaneous sessions	Design and load-test for 5,000 as a readiness proof target. Permanent peak capacity is a separate reserved-capacity commercial model.	Demand ladder, infra mode choices, 5,000 proof scenario.	Approve baseline, campaign pre-warm, and load-test windows separately.
Continuity if a key engineer leaves	No critical path should depend on one person: use bus-factor-two ownership, ADRs, runbooks, staging proof, code review, and handover packs.	Delivery gates, readiness gates, risk register, decisions to lock.	Confirm named backup routes and organisation-owned access before build starts.
Post-launch contacts and support routes	One accountable AVPN service contact plus a ticketing/source-of-truth workflow and named engineering routes behind it.	Risk register, handover gate, M&E reporting promise.	Support routes, hours, SLA windows, and escalation paths are documented in the launch runbook.
Maintenance retainer	The $3,000/month retainer is an operating envelope with capped cloud, AI, tool, support, reporting, and minor enhancement assumptions.	Maintenance bridge, line-item calculator, included vs separately approved table.	Approved usage uplift or commercial amendment is needed for sustained overage or scope expansion.

How the position has strengthened

Earlier narrow framing	Stronger current framing
"Which Gemini model are we using?"	"Router-first policy: 80-90% low-cost support, escalation only when risk or complexity requires it."
"AI costs are small."	"Text token costs can be small, but grounding, voice, long context, tool loops, and always-on hosting are the real cost traps."
"Gemma/self-hosting may save money."	"Gemma is credible for control or high utilization; at 1k-10k text conversations, hosting overhead can outweigh token savings."
"5,000 concurrent sessions are included."	"5,000 is the proof envelope; ordinary months run elastic, campaign windows are approved and monitored."
"Open-source assembly."	"Governed assembly: one system of record, explicit interfaces, audit logs, runbooks, and handover-friendly primitives."
"Maintenance covers operations."	"Maintenance covers agreed operations inside caps; expanded traffic, tools, AI, languages, and LTP complexity need approval."

Operating principles

Specific enough to be accountable: Show formulas, caps, assumptions, gates, and operating reports rather than broad reassurance.

Bounded enough to be credible: Do not sell unlimited AI, cloud, languages, support coverage, or provider-specific LTP customisation inside a fixed retainer.

Strategic posture: Position the response as disciplined programme governance with enough detail to support decision-making.

Executive-to-detail bridge: The dashboard lets AVPN move from board-level assurance to the assumptions, controls and evidence behind each answer.

Scope clarity

Avoid implying	Use this instead
Full LMS scope by default.	Current scope is a learning navigation, verification, and M&E gateway. Course hosting can be added later as a separate scope decision.
Cloud, AI, translation, email, survey, video, and tooling are unlimited.	They are included only within agreed assumptions, caps, and approval paths.
The chatbot will answer anything.	The assistant is FAQ-first, retrieval-scoped, capped, and escalates only where useful.
Raw model confidence solves language risk.	Quality is governed through approved sources, feedback, sampling, language tiering, and incident review.
One vendor/model choice is final today.	Month 1 validates the route against cost, quality, language coverage, latency, and governance.

Decisions to lock

Decision	Why it matters	Recommended gate
Billing owner and approved usage-uplift route	Confirms whether the default remains Mereka-managed capped usage or a separately agreed billing structure.	Kickoff / Month 1
AI model route and fallback UX	Controls learner experience and budget behaviour under stress.	Month 1
Common LTP data schema	Prevents 25-30 providers from becoming provider-specific parser sprawl.	Month 1
Campaign calendar and pre-warm rules	Separates ordinary operations from launch/deadline surge.	Before launch campaigns
Support coverage and escalation windows	Prevents maintenance expectations from drifting into 24/7 support without pricing.	Contract finalisation

Clarification coverage map

A client-facing map of how the platform answers AVPN's due-diligence questions, with the evidence surfaces and governance controls behind each answer.

Client-facing evidence map

AVPN question	Decision concern	Mereka answer	Evidence surface	Governance control
Q1. API model-related usage assumptions	Will AI usage create surprise monthly spend?	Bound AI to support workflows, use router-first model policy, cap usage, and separate grounding/voice/self-host economics.	Strong: Cost model, AI pricing, grounding, voice and Gemma economics.	Vendor pricing and real usage are reviewed during Month 1 and reported through the operating dashboard.
Q2. Architecture alternatives and open-source assembly	Is this governed architecture or fragile tool stitching?	Custom AVPN product layer over proven primitives, with one canonical system of record and explicit adapter boundaries.	Mapped: Response map, architecture/data-flow map, roadmap, LTP ops and risk register.	Architecture decision record and integration boundary list are agreed in Month 1.
Q3. UX if API budget is exhausted	Will learners hit a dead end or unfair language cut-off?	Use full assistant, guided help mode, critical journey mode, admin alerts, and human handoff.	Mapped: AI policy, fallback modes, cost triggers and language controls.	Fallback copy, priority journeys, thresholds and escalation rules are governed before launch.
Q4. Wrong answer in lower-resource language	How are trust, harm, fairness and correction managed?	Manage as operational quality risk: approved sources, feedback, sampling, incident review, language tiers and fallback.	Controlled: AI safeguards, risk matrix, quality controls.	Production language list and review coverage are agreed before launch.
Q5. Continuity if key engineer leaves in Month 3	Can delivery continue without a single-person dependency?	Bus-factor-two ownership, runbooks, ADRs, code review, staging proof, backups and handover pack.	Governed: Continuity model, roadmap gates, readiness gates and risk register.	Backup routes, organisation-owned access and runbooks are established from the start.
Q6. Day-to-day contact and maintenance engineers	Who owns requests after launch and how are they routed?	One accountable AVPN service contact, ticketing source of truth, named engineering routes and escalation windows.	Operational model: Support operating model, maintenance bridge and risks/decisions.	Named service routes, hours and SLA windows are documented in the support runbook.
Q7. Maintenance retainer and 5 dev-hours	Is the fixed retainer being read as unlimited operations?	$3,000/month is the operating envelope; 5 dev-hours is a minor-enhancement sub-line; cloud/AI/tools are capped.	Strong: Maintenance bridge, line-item calculator, included/separate table.	Billing owner, usage-uplift route and approval thresholds are documented before go-live.

Architecture and data-flow map

The governing rule is simple: the AVPN application layer owns programme truth; external tools integrate through explicit contracts and do not mutate programme state directly.

Q2 evidence

Users: Learner, LTP, AVPN admin

Interact through role-aware journeys for registration, recommendations, evidence, verification, reporting and support.

System of record: AVPN product layer

Owns learner IDs, profile state, course routing, LTP status, certificate/survey state, audit events and approval transitions.

Identity primitive: OIDC / role provider

Authenticates users and roles. It does not own learner progress, completion, recommendation or M&E truth.

Workflow primitive: Jobs, queues, retries

Runs long processes such as evidence review, certificate issuance, notifications and surveys through application-owned contracts.

LTP boundary: Provider uploads and APIs

Use common schemas, validation, exception queues and audit logs. Provider-specific parsing is a change boundary.

AI boundary: Support gateway

Serves FAQ/retrieval first, routes model calls by risk, records privacy-safe metadata, and escalates unclear answers.

Reporting layer: M&E and exports

Reads governed events and curated views for donor reporting, funnel health, LTP performance, cost and language quality.

Operations: Runbooks and observability

Deployment, rollback, monitoring, budget alerts, incident review and handover evidence keep the architecture maintainable.

Learner support modes when AI is constrained

The learner should always receive a next best action. Budget controls change the support mode; they should not create a dead end.

Q3 evidence

Mode 1: Full assistant (normal operation)

Natural-language help for course discovery, FAQ, eligibility, completion and learning guidance.
Deterministic and retrieval-scoped answers are used before model calls.
Admin sees token use, route, language, feedback and escalation outcome.

Mode 2: Guided help (approaching cap or quality threshold)

Suggested questions, FAQ answers, course cards, certificate help and status flows stay available.
Open-ended tutoring narrows to approved content and saved follow-up.
Learner copy frames this as keeping support available for all learners.

Mode 3: Critical journey (campaign stress or incident mode)

Registration, recommendation, LTP handoff, evidence upload, certificate status and human handoff stay protected.
Exploratory AI is paused or routed to later review.
AVPN receives threshold alerts and can approve temporary capacity or AI uplift.
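
The three support modes can be read as a simple selection rule driven by budget and incident signals. The sketch below is illustrative only: the 80% and 100% thresholds, the flag names and the choice to treat a fully consumed cap as Mode 3 are assumptions for demonstration, not agreed values; the actual thresholds would be approved before launch.

```python
from enum import Enum


class SupportMode(Enum):
    FULL_ASSISTANT = "Mode 1: Full assistant"
    GUIDED_HELP = "Mode 2: Guided help"
    CRITICAL_JOURNEY = "Mode 3: Critical journey"


# Hypothetical thresholds (share of the monthly AI cap consumed).
GUIDED_HELP_AT = 0.80
CRITICAL_JOURNEY_AT = 1.00


def select_support_mode(
    ai_cap_used: float,
    quality_alert: bool = False,
    incident_active: bool = False,
) -> SupportMode:
    """Pick the learner support mode; every mode still offers a next best action."""
    # Incident or exhausted cap: protect critical journeys only (assumed mapping).
    if incident_active or ai_cap_used >= CRITICAL_JOURNEY_AT:
        return SupportMode.CRITICAL_JOURNEY
    # Approaching the cap or a quality threshold: narrow to guided help.
    if quality_alert or ai_cap_used >= GUIDED_HELP_AT:
        return SupportMode.GUIDED_HELP
    return SupportMode.FULL_ASSISTANT


print(select_support_mode(0.19).value)
print(select_support_mode(0.85).value)
print(select_support_mode(0.40, incident_active=True).value)
```

The rule deliberately has no "off" state, which reflects the principle that budget controls change the support mode rather than create a dead end.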

Continuity and support operating model

Function	Operating route	Proof / backup
Service ownership	One accountable AVPN-facing service owner receives requests, coordinates reporting and manages escalation.	Named contact, backup route, monthly report, ticket source of truth.
Engineering escalation	Engineering lead triages P1/P2 issues and routes module work to the right owner.	Escalation matrix, runbooks, access review, incident log.
Module continuity	Core modules have primary owner plus backup reviewer from the start, especially around Month 3 convergence.	Code review, ADRs, handover notes, staging proof, module checklist.
Support source of truth	Ticketing/workflow system records learner, LTP and AVPN/admin support requests; chat channels are coordination only.	Priority, owner, SLA window, status, decision record and resolution history.
Maintenance boundary	Business-hours support, dependency checks, minor fixes, reporting and five dev-hours remain inside the capped retainer.	Change-control route for broader coverage, feature scope, campaign support or extra engineering capacity.

Month 1 decision checklist

Decision	Evidence to produce	Pass condition
Architecture boundary	ADR plus system-of-record map and integration list.	AVPN agrees what owns programme truth and what remains adapter/tooling.
AI support policy	Model route, cap table, fallback copy, language thresholds and escalation rules.	Retainer-safe default plus explicit approval route for uplift.
LTP data contract	Upload template, validation rules, exception states and audit fields.	25-30 providers can be governed through a common workflow.
Support model	Named contact routes, ticket workflow, service hours, priority definitions and backup owners.	AVPN knows who is on point after launch without implying unlimited coverage.
Commercial guardrails	Billing owner, usage-uplift rules, campaign uplift path and monthly reporting format.	Costs are governable before launch traffic begins.

What AVPN can test

Stress-test the assumptions: AVPN can change MAU, model route, grounding, campaign peaks, and see when the retainer breaks.

Show the operating model: The dashboard turns scale, cost, AI routing, fallback modes and support coverage into visible controls.

Make governance visible: Caps, approval triggers, source-of-truth boundaries and Month 1 gates are readable without digging through notes.

Support executive review: AVPN can move from board-level assurance to detailed assumptions in one artifact.

Governance evidence register

Client question	Evidence in the dashboard	Governance answer
Are costs traceable?	Usage controls, line items, formulas, scenario sensitivity and approval triggers.	Monthly reporting reconciles actual usage against approved caps.
Where does data live?	Architecture map shows the AVPN product layer as system of record and tools as bounded primitives.	One operational source of truth with documented interfaces and audit history.
What happens when AI is constrained?	Support modes show full assistant, guided help and critical journey behaviour.	Learners get guided help and human handoff rather than a dead end.
Who is accountable after launch?	Continuity model shows service ownership, engineering escalation, module backups and support source of truth.	Service ownership, engineering escalation and backup coverage are explicit.
How does Month 1 reduce risk?	Decision checklist connects open questions to evidence and pass conditions.	Open design choices become signed operating controls before launch commitments harden.

How to read this dashboard

Review mode	Where to look
Executive overview	Use Executive and Due Diligence for the board-level answer.
Cost review	Use Cost Model to test AI route, grounding, token profile and campaign usage.
Architecture review	Use Decision Structure and Delivery Gates to inspect source-of-truth and handover controls.
Operating review	Use LTP Operations, Reporting & M&E, and Risk Register for post-launch governance.

Evidence and governance controls

Control	Evidence shown	Why it matters
Vendor pricing references	Linked official vendor references and model-family pricing controls.	Keeps price changes visible before they affect the retainer.
Cost line traceability	Line-item calculator separates cloud, AI, tooling and labour.	Makes approval triggers visible.
Support routes	Operating routes are shown as roles and workflow surfaces.	Final named contacts and SLA windows live in the support runbook.

Demand logic

The concurrency estimate is driven by active learners, gateway minutes, and concentration.

Programme target: 300,000

Registrations across the programme period.

Programme months: 37

Used for planning pace.

Monthly pace: 8,108

Not the same as simultaneous users.

Current MAU: 25,000

Scenario-controlled monthly active learners.

Gateway time: 10 min

Time spent in the AVPN gateway, not external course time.

Peak sessions: 58

Computed scenario peak.

Formula

Average concurrent = MAU x gateway minutes / active minutes per month. Peak concurrent = average concurrent x peak concentration factor.
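
The formula can be checked against the baseline controls with a few lines of Python. This is a sketch under one important assumption: the dashboard does not state the value of "active minutes per month", so 21,600 (30 days x 12 active hours x 60 minutes) is used here because it reproduces the 58-session figure. Function names are illustrative.

```python
# Assumption: not stated in the dashboard. 30 days x 12 active hours x 60 min.
ACTIVE_MINUTES_PER_MONTH = 30 * 12 * 60  # 21,600


def monthly_registration_pace(target_registrations: int, programme_months: int) -> float:
    """Average registrations per month across the programme period."""
    return target_registrations / programme_months


def average_concurrent(
    mau: int,
    gateway_minutes: float,
    active_minutes: int = ACTIVE_MINUTES_PER_MONTH,
) -> float:
    """Average concurrent = MAU x gateway minutes / active minutes per month."""
    return mau * gateway_minutes / active_minutes


def peak_concurrent(
    mau: int,
    gateway_minutes: float,
    peak_factor: float,
    active_minutes: int = ACTIVE_MINUTES_PER_MONTH,
) -> float:
    """Peak concurrent = average concurrent x peak concentration factor."""
    return average_concurrent(mau, gateway_minutes, active_minutes) * peak_factor


pace = monthly_registration_pace(300_000, 37)
peak = peak_concurrent(mau=25_000, gateway_minutes=10, peak_factor=5)

print(f"Monthly pace: {round(pace):,}")   # 8,108
print(f"Peak sessions: {round(peak)}")    # 58
```

The gap between 58 projected peak sessions and the 5,000-session proof target is what supports running an elastic baseline in ordinary months.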

Scenario ladder

Generated from the same formula as the controls, so the table reconciles with the calculator.

Scenario	MAU	Gateway min	Peak factor	Peak sessions	Interpretation

Cloud Run request-load model

Gateway sessions explain programme demand; Cloud Run cost must be validated from request shape and configured capacity.

Month 1 validation

Layer	Formula / assumption	Status
Learner concurrency	Average active gateway sessions = MAU x gateway minutes / active minutes per month. Peak active gateway sessions = average x peak factor.	Used for readiness planning
Gateway request load	Monthly gateway requests = MAU x sessions per active learner x API requests per session.	API requests/session TBC
Peak request rate	Peak requests per second = (peak active sessions x API requests per session / session duration in seconds) x burst factor.	Duration and burst factor TBC
Cloud Run capacity	Required instances = ceil((peak request rate x average request duration seconds) / Cloud Run concurrency setting).	Concurrency, CPU/memory and min instances TBC
Cloud Run cost	Request charges + active vCPU-seconds + active memory GiB-seconds + configured min-instance idle cost + load balancer/network/egress allowance.	Google Pricing Calculator validation
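
The request-load layers in the table above chain together as shown in this sketch. Every input value in the example is a placeholder, because API requests per session, burst factor, request duration and the concurrency setting are all still to be confirmed in Month 1. The figures of 30 requests per session, a burst factor of 2, 0.2-second requests and a concurrency of 80 are assumptions for illustration only.

```python
import math


def monthly_gateway_requests(
    mau: int, sessions_per_learner: float, requests_per_session: float
) -> float:
    """Monthly gateway requests = MAU x sessions per active learner x requests per session."""
    return mau * sessions_per_learner * requests_per_session


def peak_request_rate(
    peak_sessions: float,
    requests_per_session: float,
    session_seconds: float,
    burst_factor: float,
) -> float:
    """Peak requests per second during the busiest window."""
    return peak_sessions * requests_per_session / session_seconds * burst_factor


def required_instances(
    peak_rps: float, avg_request_seconds: float, concurrency: int
) -> int:
    """Instances needed = ceil(in-flight requests / per-instance concurrency)."""
    in_flight = peak_rps * avg_request_seconds
    return math.ceil(in_flight / concurrency)


# Placeholder inputs for the 5,000-session proof scenario (all TBC).
rps = peak_request_rate(
    peak_sessions=5_000,
    requests_per_session=30,   # hypothetical
    session_seconds=600,       # 10-minute gateway session
    burst_factor=2,            # hypothetical
)
instances = required_instances(rps, avg_request_seconds=0.2, concurrency=80)

print(f"Peak request rate: {rps:.0f} req/s")
print(f"Required instances: {instances}")
```

The output only indicates an order of magnitude; the cost row still needs Google Pricing Calculator validation once the real request shape is measured.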

What would change the answer

Trigger	Why it matters	Commercial response
All learners pushed through one synchronized deadline	Peak factor rises sharply.	Temporary campaign capacity window.
Gateway becomes full LMS hosting	Session time and storage increase.	Separate architecture and cost model.
Permanently reserved 5k capacity requested	Idle capacity becomes the cost driver.	Separately priced reserved-capacity option.
External course providers return inconsistent data	Verification queues and support load rise.	LTP schema governance and change-order rules.

Base cost assumptions and submitted budget stack

These lines must stay distinct in client conversations.

Bucket	Amount	Meaning
Development	$154,000 one-time	Build phase, June to November 2026.
Maintenance retainer	$3,000/month	Operating retainer across the maintenance period.
GCP infrastructure sub-line	$700/month	Baseline cloud platform allowance.
Vertex/Gemini AI sub-line	$250/month	Capped AI usage allowance, separate from dev-hours.
Other tool/API sub-lines	$450/month	Email, video testimonial, survey, translation, and related tools.

Commercial guardrail

The retainer covers agreed operations within approved assumptions, with caps for cloud, AI, translation, email, video, survey, and feature scope.

Current scenario vs allowance

Only cloud and AI are modelled here; other tool/API lines remain separate.

Within envelope

Cloud estimate floor: $430

Planning floor; $700 remains the capped allowance.

AI estimate: $47

Against $250/month AI allowance.

Tool/vendor subtotal: $927

Informational; cap decisions use separate lines.

Infra usage: 61%

Of $700 allowance.

AI usage: 19%

Of $250 allowance.

Maint. total: $2,527

Labour + vendor/tool lines.

Inside approved operating envelope

This scenario fits the submitted monthly retainer under the stated assumptions.

Cost assumption controls

These variables drive the AI line and reveal when a change order is needed.

Chatbot adoption 30%

Sessions/user 1.5

Turns/session 6

Model responses/session 2.5

Token profile Response

Google Search / Maps prompts 0%

Chatbot users: 7,500

MAU expected to use the assistant.

Model-priced units: 28,125

Monthly model-routed responses priced by model.

Support turns: 67,500

Used for grounding and operational load.

Token AI cost: $47

Input/output token spend before grounding.

Grounding cost: $0

Search/grounding uplift after free allowance.
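
The volume figures above follow directly from the control values. A minimal sketch, using the baseline controls shown in this section (25,000 MAU, 30% adoption, 1.5 sessions per user, 6 turns and 2.5 model responses per session); the function and key names are illustrative.

```python
def chatbot_volumes(
    mau: int,
    adoption_pct: float,
    sessions_per_user: float,
    turns_per_session: float,
    model_responses_per_session: float,
) -> dict:
    """Derive monthly assistant volumes from the cost assumption controls."""
    chatbot_users = mau * adoption_pct / 100
    sessions = chatbot_users * sessions_per_user
    return {
        "chatbot_users": chatbot_users,
        # Model-routed responses: these drive the text-token AI line.
        "model_priced_units": sessions * model_responses_per_session,
        # All turns, including FAQ/retrieval answers: operational load.
        "support_turns": sessions * turns_per_session,
    }


volumes = chatbot_volumes(
    mau=25_000,
    adoption_pct=30,
    sessions_per_user=1.5,
    turns_per_session=6,
    model_responses_per_session=2.5,
)

for name, value in volumes.items():
    print(f"{name}: {value:,.0f}")
```

Separating model-priced units from support turns shows why FAQ-first design matters: only 2.5 of every 6 turns are assumed to reach a priced model.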

Maintenance bridge

Shows whether the selected scenario fits the $3,000/month retainer.

Retainer-safe

Monthly budget: $3,000

Submitted maintenance retainer.

Monthly model: $2,527

Calculated all-in maintenance line.

Monthly status: $473

Inside assumptions; not a credit or scope increase.

32-month budget: $96,000

Submitted maintenance total.

32-month model: $80,856

If this scenario held every month.

Overage/month: $0

Requires approval if above $0.
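
The bridge arithmetic can be sketched as follows. Two inputs are inferred rather than stated: the labour line of $1,600 is derived as the $2,527 maintenance total minus the $927 tool/vendor subtotal, and an unrounded AI line of about $46.76 is assumed because 32 x $2,527 would give $80,864 rather than the $80,856 shown. Names are illustrative.

```python
from dataclasses import dataclass

RETAINER_USD = 3000.0
MAINTENANCE_MONTHS = 32


@dataclass
class MaintenanceBridge:
    monthly_model: float
    headroom: float      # inside assumptions; not a credit or scope increase
    overage: float       # requires approval if above zero
    period_budget: float
    period_model: float


def build_bridge(
    cloud: float,
    ai: float,
    tools: float,
    labour: float,
    retainer: float = RETAINER_USD,
    months: int = MAINTENANCE_MONTHS,
) -> MaintenanceBridge:
    """Compare the modelled monthly maintenance cost with the submitted retainer."""
    monthly_model = cloud + ai + tools + labour
    return MaintenanceBridge(
        monthly_model=monthly_model,
        headroom=max(0.0, retainer - monthly_model),
        overage=max(0.0, monthly_model - retainer),
        period_budget=retainer * months,
        period_model=monthly_model * months,
    )


# Labour (1600.0) and the unrounded AI line (46.76) are inferred assumptions.
bridge = build_bridge(cloud=430.0, ai=46.76, tools=450.0, labour=1600.0)

print(f"Monthly model: ${bridge.monthly_model:,.0f}")
print(f"Monthly status: ${bridge.headroom:,.0f}")
print(f"32-month model: ${bridge.period_model:,.0f}")
print(f"Overage/month: ${bridge.overage:,.0f}")
```

Headroom and overage are reported as separate fields so that an under-cap month is never presented as a credit.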

Unit cost assumptions

The visible formula layer behind the monthly model.

Formula visible

Assumption	Value used	Why it matters
Chatbot usage volume	MAU x adoption x sessions/user x model responses/session	Model-routed support units, not registrations, drive the text-token AI line.
Token profile	800 input tokens + 500 billable output tokens per model-routed response	Billable output includes thinking/reasoning tokens where charged; switch to conservative full-conversation stress testing when needed.
Default volume route	Gemini 3.1 Flash-Lite: $0.25 / 1M input + $1.50 / 1M output	Recommended default for most production learner-support conversations.
Escalation route	Gemini 3.5 Flash: $1.50 / 1M input + $9.00 / 1M output	Use when ambiguity, risk, confidence, or quality requires it.
Routed policy	85% Gemini 3.1 Flash-Lite + 15% Gemini 3.5 Flash	Near-premium UX without paying premium pricing on every conversation.
Deployment pricing basis	Public Gemini Developer API-style pricing as a planning reference	Final production pricing must be revalidated in Month 1 against the selected Google Cloud / Vertex / Gemini route, region, billing mode, grounding method, and model policy.
Google Search / Maps grounding uplift	Gemini 3: first 5,000 Google Search prompts/month free, then $14 / 1,000 prompts	External live web/search grounding is the fastest way to break the $250 AI line.
Approved FAQ/catalogue retrieval	Separate from live web/search grounding	AVPN-approved FAQ and catalogue retrieval is treated as retrieval-scoped support, not priced through the Google Search grounding slider.
Cloud cost basis	Planning floor below: Cloud Run reserve + Cloud SQL HA + storage/backups + observability/CDN + contingency.	The $700 infrastructure line remains the contractual allowance for final region, HA, logs, egress and reserve uncertainty.
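
The token and grounding rows above combine into the monthly AI line as sketched below. Prices, the 800/500 token profile, the 85/15 routed split and the grounding allowance are taken from the table as planning references; the dictionary keys and function names are hypothetical, and final pricing still needs Month 1 revalidation.

```python
# Planning-reference prices from the table: (input, output) USD per 1M tokens.
PRICES_PER_MILLION = {
    "flash_lite": (0.25, 1.50),
    "flash": (1.50, 9.00),
}

INPUT_TOKENS = 800    # per model-routed response
OUTPUT_TOKENS = 500   # billable output, including thinking tokens where charged


def unit_cost(model: str) -> float:
    """Token cost in USD for one model-routed response."""
    price_in, price_out = PRICES_PER_MILLION[model]
    return (INPUT_TOKENS * price_in + OUTPUT_TOKENS * price_out) / 1_000_000


def routed_unit_cost(mix: dict) -> float:
    """Blended cost per response for a routing mix such as 85% / 15%."""
    return sum(share * unit_cost(model) for model, share in mix.items())


def grounding_cost(
    prompts: int, free_prompts: int = 5_000, usd_per_thousand: float = 14.0
) -> float:
    """Search grounding uplift after the free monthly allowance."""
    billable = max(0, prompts - free_prompts)
    return billable / 1_000 * usd_per_thousand


units = 28_125  # model-priced units in the baseline scenario
token_ai_cost = units * routed_unit_cost({"flash_lite": 0.85, "flash": 0.15})

print(f"Token AI cost: ${token_ai_cost:,.2f}")
print(f"All-escalation route: ${units * unit_cost('flash'):,.2f}")
print(f"Grounding, 25,000 prompts: ${grounding_cost(25_000):,.2f}")
```

In this example grounding 25,000 prompts costs more than the entire $250 AI allowance on its own, which illustrates why live web grounding is treated as an approved uplift.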

Cloud cost bill of materials

The selected cloud estimate is a planning floor pending Google Pricing Calculator confirmation. Cloud Run compute is usually modest; database, observability, warm capacity and reserves drive the baseline. The $700/month infra allowance remains the commercial cap, not a promise that every month bills at the floor estimate.

Calculator check

Line	Assumption / formula	Baseline	Selected	Confidence

How certain is this?

Medium for planning. The order of magnitude is defensible, but the final number should be rerun in the Google Pricing Calculator after region, Cloud SQL HA, min instances, request duration, concurrency, log retention, CDN/egress and AVPN billing ownership are locked.

Development budget breakdown

One-time build cost by workstream, separated from maintenance.

$154k build

Workstream	Budget	Client meaning

Contract value and scenario status

Shows whether the selected scenario stays inside submitted operating assumptions without implying credits or extra scope.

Inside assumptions

Submitted contract value: $250,000

$154k build + $96k maintenance.

Scenario status: Inside assumptions

Inside/outside submitted operating model.

Approval trigger: None

Usage uplift or commercial amendment if required.

Cost/target registration: $0.83

Submitted contract value over 300,000 registrations.

Retainer/MAU: $0.12

$3,000 monthly retainer over selected MAU.

AI cost/1k units: $9.63

AI line divided by model-priced units.

Line-item cost calculator

Monthly and 32-month values are calculated from the selected scenario.

Line	Budget/mo	Model/mo	Delta/mo	32-month model	Status

Scenario cost comparison

Scenario	Peak	Cloud	AI	Maint./mo	Status

AI route sensitivity

Route	AI/mo	Delta vs $250	Read

Published model price comparison

Text-only estimate using the selected token profile.

Token-only

Model	Cost / unit	1k units	5k units	10k units	Strategic read

Batch / Flex economics

Useful for offline reporting, classification, and bulk processing; not the default for live learner chat.

Model	1k units	5k units	10k units	Use when

Voice and self-host thresholds

Option	1k	5k	10k	Read

Approval math

These rows translate usage choices into monthly overage and decision language.

Decision trigger	Added cost/mo	If 32 months	Client decision

Included vs separately approved

Inside retainer when capped	Separately approved usage uplift
Business-hours support, QBR/reporting, 5 dev-hours, dependency/security checks.	24/7 coverage, new feature scope, major integrations, or additional dev capacity.
Elastic baseline cloud and ordinary storage/monitoring within the $700 infra line.	Always-on 5k reserve, dedicated cluster, campaign pre-warm, or load-test windows.
Routed AI support within the $250 AI line and approved fallback behaviour.	High-quality model everywhere, broad web grounding, unusually high chatbot adoption, or no per-session cap.
Standard email, video testimonial, survey, and translation tool allowances.	Large notification blasts, heavy media usage, new survey tools, or expanded translation volumes.
Common LTP upload schema, validation, queue ageing, and agreed reporting.	Provider-specific parser logic, custom LTP workflows, or remediation of poor source data.

Infra mode choices

Mode	What it means	Commercial status
Elastic baseline	Normal Cloud Run/GCP operation within agreed traffic, storage, logging and AI assumptions.	Included within $700 infra allowance, subject to calculator validation
Campaign pre-warm	Temporary capacity, logging and support uplift for a planned traffic window.	Pre-approved uplift if above cap
5,000-session load proof	Time-bound load test and readiness validation.	Build/UAT proof activity, not monthly steady state
Always-reserved 5,000 capacity	Dedicated or permanently pre-warmed infrastructure.	Separate commercial model

Cost escalation triggers

Always-on reserved peak: If AVPN wants guaranteed 5k capacity every month, price it as a reserved-capacity option.

Higher-cost AI default: Flash or Pro everywhere breaks the $250 AI line under many campaign assumptions.

General web grounding: External search should be an approved uplift, not default behaviour for every turn.

New languages or custom LTP parsers: These are scope and operations changes, not invisible retainer work.

AI route comparison

Same traffic, different model policies.

Model routing keeps high-volume support away from expensive defaults while preserving escalation quality.

Support policy

Layer	Default behaviour	Control
FAQ first	Approved catalogue and platform guidance answer common questions.	No token spend where static support is enough.
Low-cost route	Bounded navigation and discovery support.	Per-session and monthly caps.
Escalation route	Ambiguous, sensitive, or quality-critical cases.	Stronger model, sampled review, budget visibility.
Guided help mode	If caps are approached, assistant degrades gracefully.	Learners are not shown blunt budget failure language.
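
The layered policy in the table can be expressed as an ordered set of checks per learner request. This is a sketch only: the request fields, the per-session cap of 5 model calls and the route labels are hypothetical and would be replaced by the approved policy and thresholds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SupportRequest:
    faq_answer: Optional[str]  # approved FAQ/catalogue match, if any
    sensitive: bool = False
    ambiguous: bool = False


def choose_route(
    request: SupportRequest,
    session_model_calls: int,
    per_session_cap: int = 5,          # hypothetical cap
    monthly_cap_reached: bool = False,
) -> str:
    """Return which support layer should handle the request."""
    # Layer 1: FAQ first, so static answers cost no tokens.
    if request.faq_answer is not None:
        return "faq"
    # Caps approached: degrade gracefully instead of failing.
    if monthly_cap_reached or session_model_calls >= per_session_cap:
        return "guided_help"
    # Ambiguous, sensitive or quality-critical cases go to the stronger model.
    if request.sensitive or request.ambiguous:
        return "escalation"
    # Everything else stays on the low-cost route.
    return "low_cost"


print(choose_route(SupportRequest(faq_answer="See the certificate FAQ."), 0))
print(choose_route(SupportRequest(faq_answer=None), 1))
print(choose_route(SupportRequest(faq_answer=None, ambiguous=True), 1))
print(choose_route(SupportRequest(faq_answer=None), 5))
```

Checking the FAQ layer before the caps means approved static answers stay available even when model usage is constrained.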

Language quality risk

Lower-resource languages need review queues and explicit fallback rules.

Quality controls

Grounding signal: Track whether answers came from approved FAQ/catalogue content or fallback.

Feedback by language: Capture unhelpful reports, issue type, and time-to-resolution.

Human sampling: Use native-speaker or LTP-assisted review before declaring a language production-ready.

Post-incident review: Material AVPN/LTP-reported quality issues should receive a documented review within 5 business days.

LTP verification load

Supporting 25-30 LTPs is mainly a workflow-governance challenge, not a compute challenge.

Operating model

Common upload template: Require shared learner identifiers, completion status, completion date, and evidence fields.

Validation before import: Reject schema errors early; route ambiguous records to an on-hold queue.

Audit trail: Track who uploaded, what changed, and why records were accepted or rejected.

Change-order boundary: Provider-specific parser logic beyond the agreed schema should be separately approved.
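The template, validation and audit rules above could look like the following in practice. This is a hedged sketch: the column names, the status values and the rule that a completed record without evidence goes on hold are assumptions for illustration, since the agreed schema is still a decision to close.

```python
import csv
import io
from datetime import date, datetime, timezone

# Assumed column names for the common template; the agreed schema may differ.
REQUIRED = ["learner_id", "completion_status", "completion_date", "evidence_ref"]
KNOWN_STATUSES = {"completed", "in_progress", "withdrawn"}  # hypothetical values


def validate_upload(csv_text: str, uploaded_by: str) -> dict:
    """Split rows into accepted, on-hold and rejected, with an audit entry per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        # Schema error: reject the whole file before any row is imported.
        raise ValueError(f"missing columns: {missing}")

    result = {"accepted": [], "on_hold": [], "rejected": [], "audit": []}
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        bucket, reason = "accepted", "passed validation"
        if not (row["learner_id"] or "").strip():
            bucket, reason = "rejected", "empty learner_id"
        elif row["completion_status"] not in KNOWN_STATUSES:
            bucket, reason = "rejected", "unknown completion_status"
        else:
            try:
                date.fromisoformat(row["completion_date"] or "")
            except ValueError:
                bucket, reason = "rejected", "completion_date is not an ISO date"
            else:
                no_evidence = not (row["evidence_ref"] or "").strip()
                if row["completion_status"] == "completed" and no_evidence:
                    # Ambiguous rather than invalid: hold for manual review.
                    bucket, reason = "on_hold", "completed without evidence"
        result[bucket].append(row)
        result["audit"].append({
            "line": line_no, "uploaded_by": uploaded_by, "outcome": bucket,
            "reason": reason, "at": datetime.now(timezone.utc).isoformat(),
        })
    return result
```

Anything an LTP needs beyond these shared checks, such as a provider-specific date format, would sit outside this function and fall under the change-order boundary.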

Monthly LTP operating rhythm

Cadence	Surface	Decision it supports
Weekly	Upload success, rejected rows, schema exceptions, queue age.	Which LTP needs support before reporting quality degrades.
Monthly	Completions, certificates, unresolved verification cases, LTP responsiveness.	Which delivery partners create programme risk.
Campaign window	Daily throughput, peak queue depth, error types, support tickets.	Whether to extend campaign capacity or intervene operationally.
Quarterly	Data quality trend and schema-change requests.	Whether the common template needs governance updates.

Reporting dashboard spine

These are the client-facing reporting modules the platform should make visible after launch.

Module	Questions answered	Core fields
Learner funnel	Where do learners drop off?	Registration, recommendation, click-through, start, completion, certificate, survey.
Programme reach	Who is being reached across markets?	Country, cohort, language, approved demographic fields, LTP, course.
LTP delivery	Which partners are operationally healthy?	Uploads, completions, rejected rows, verification ageing, support exposure.
AI and language quality	Where is the support layer risky?	Usage, escalation, fallback, feedback, language tier, issue ageing.
Finance guardrails	Are costs inside the approved envelope?	Cloud, AI, translation, email, survey, campaign uplift, approval status.
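As an illustration of the learner funnel module, the drop-off question can be answered by comparing counts at adjacent stages. The stage keys mirror the core fields in the table; the function name and any counts passed to it are hypothetical.

```python
# Funnel stages in the order listed in the Learner funnel row.
STAGES = ["registration", "recommendation", "click_through", "start",
          "completion", "certificate", "survey"]


def funnel_dropoff(counts: dict[str, int]) -> list[tuple[str, str, float]]:
    """Return (from_stage, to_stage, drop_off_rate) for each adjacent pair."""
    steps = []
    for prev, nxt in zip(STAGES, STAGES[1:]):
        entered = counts.get(prev, 0)
        if entered == 0:
            rate = 0.0  # nothing entered this stage, so no drop-off to report
        else:
            rate = 1 - counts.get(nxt, 0) / entered
        steps.append((prev, nxt, round(rate, 3)))
    return steps
```

Sorting the returned steps by rate shows where the largest share of learners is lost, which is the figure a monthly operating report would lead with.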

Reporting promise

Monthly operating report: Cost, learner funnel, LTP queue health, language risk, and incidents.

Donor-ready export: Clean CSV/XLSX export for agreed outcomes and disaggregation fields.

Exception register: Separate normal programme movement from data quality, integration, or cost exceptions.

Governance review: Use the data to decide capacity, AI caps, LTP support, and language QA investments.

Positioning

This governance layer makes the fixed retainer, LTP network, and multilingual AI support manageable.

Delivery roadmap and gates

A five-month build only stays credible if each phase has a visible proof gate and a named decision it resolves.

Gate-led delivery

Month 1 Confirm operating assumptions

Lock scope, billing stance, AI policy, languages, and LTP data rules.

Architecture decision record
Common LTP schema
AI caps and fallback policy

Month 2 Build core gateway

Implement learner registration, recommendation/routing, identity, and admin foundations.

Staging environment live
Audit log baseline
First funnel dashboard

Month 3 Wire LTP and evidence flow

Validate uploads, completion evidence, certificates, queue ageing, and exception handling.

LTP pilot upload
Verification queue
Certificate workflow

Month 4 Harden AI and reporting

Operationalise language QA, cost dashboards, graceful fallback, and M&E exports.

Language review pack
Cost guardrail alerts
Donor-ready export

Month 5 Prove readiness and handover

Run load proof, UAT, support rehearsal, runbook review, and go-live decision gate.

5,000-session load proof
UAT sign-off
Handover pack

Readiness gates

Gate	Pass condition	Owner decision
Cost gate	Infra, AI, translation, email and survey usage have caps and reporting.	Approve baseline and any usage-uplift route.
Scale gate	Load test documents 5,000-session envelope and campaign pre-warm runbook.	Approve launch/campaign window.
Language gate	Tiered language QA, fallback, feedback and incident review are live.	Approve production language list.
LTP gate	At least one pilot upload proves validation, queue ageing and audit trail.	Approve common upload schema.
Handover gate	Runbooks, ADRs, backups, org-owned access and support routes are reviewed.	Approve maintenance transition.

What AVPN should see before go-live

Working learner journey: Registration, recommendation, LTP handoff, completion evidence and certificate path.

Operating dashboard: Funnel, cost, AI quality, language risk, LTP queue health and exception register.

Proof pack: Load-test result, UAT evidence, pilot LTP upload, language review sample and runbooks.

Decision log: Billing model, AI model route, support coverage, campaign rules and change boundaries.

Risk register

Risk	Impact	Mitigation
Retainer interpreted as unlimited usage	Commercial	Hard caps, assumptions table, approval workflow.
5k treated as steady-state capacity	Cost	Load-test proof target plus campaign pre-warm option.
Low-resource language answer is wrong	Trust	FAQ grounding, feedback, review sampling, fallback and incident process.
LTP CSVs are inconsistent	Operations	Common template, validation, on-hold queue, audit trail.
One-person technical dependency	Continuity	Runbooks, ADRs, code review, org-owned access, named backup coverage.
Exact pricing drifts before submission	Evidence	Refresh vendor pricing before final commercial quote.

Decisions to close

Billing model: AVPN-owned GCP project, Mereka-managed capped usage, or hybrid.

5k interpretation: Confirm load-test proof target versus permanently reserved capacity.

AI policy: Approve model routing, hard caps, grounding limits, and graceful fallback.

LTP data rules: Approve common upload schema and provider-specific change boundaries.

Support commitment: Confirm business-hours plus P1/P2 escalation or price broader coverage.

Assumption sources

AVPN TOR and clarification trail: Defines the requested learning navigation/data-management platform, 300,000 registration scale, multilingual support, LTP network, and 5,000-session readiness expectation.

Submitted commercial proposal: Defines the $154,000 build budget, $3,000/month maintenance model, $700 infra line, $250 AI line, and other tool/API allowances.

Mereka clarification response model: Defines the bounded AI support posture, fallback UX, continuity practices, maintenance framing, and lower-resource-language quality controls.

Mereka finance and concurrency model: Translates registration volume into MAU, gateway minutes, peak concentration, cloud cost, AI route cost, and approval thresholds.
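One plausible way to turn monthly active users, gateway minutes and a peak factor into a concurrency figure is sketched below. This is not the Mereka finance and concurrency model itself: the formula, the 30-day month and the one-session-per-user default are assumptions made for illustration.

```python
MINUTES_PER_MONTH = 30 * 24 * 60  # assumption: 30-day month


def estimate_peak_concurrency(monthly_active: int, gateway_minutes: float,
                              peak_factor: float,
                              sessions_per_user: float = 1.0) -> float:
    """Average concurrent sessions over the month, scaled by a peak factor."""
    # Total minutes learners spend in the gateway during the month.
    session_minutes = monthly_active * sessions_per_user * gateway_minutes
    # Spread evenly across the month to get average concurrency.
    average_concurrency = session_minutes / MINUTES_PER_MONTH
    # Concentrate usage into peaks (campaigns, evenings, launch days).
    return average_concurrency * peak_factor
```

The dashboard's baseline controls (25,000 monthly active, 10 minutes of gateway time, 5x peak factor) can be passed in directly; varying `peak_factor` and `sessions_per_user` shows how sensitive the estimate is to campaign concentration.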

Vendor pricing references

Exact prices and product capabilities are governed by the live vendor references used for commercial reconciliation.

Google Cloud Run pricing (cloud.google.com/run/pricing): Used to anchor elastic baseline assumptions.

Google Cloud SQL pricing (cloud.google.com/sql/pricing): Used for database CPU, memory, HA, storage and backup assumptions.

Google Cloud Observability pricing (cloud.google.com/products/observability/pricing): Used for logging, monitoring and trace cost-control assumptions.

Google Kubernetes Engine pricing (cloud.google.com/kubernetes-engine/pricing): Used to compare Autopilot and reserved-capacity alternatives.

Gemini Developer API pricing (ai.google.dev/gemini-api/docs/pricing): Used as a public paid-tier planning reference for text-token rates, billable output tokens and Search/Maps grounding.

Gemini / Vertex generative AI pricing (cloud.google.com/gemini-enterprise-agent-platform/generative-ai/pricing): Used to compare low-cost, routed, and approval-gated premium AI policies.

Cloud Run GPU docs (docs.cloud.google.com/run/docs/configuring/services/gpu): Internal lab reference for L4 and RTX minimum CPU/memory assumptions.

Google Gemma docs (ai.google.dev/gemma/docs): Internal lab reference for open-model route exploration, with hosting costs treated separately.

Planning note

Contractual values are reconciled against approved scope, region, billing ownership, selected AI model policy, campaign assumptions, and live vendor pricing.