vojo/docs/ai/android.md

Android (Capacitor)

Requirements

  • Node >= 22
  • JDK 17+ (21 used in practice)
  • Android SDK with platform 36 + build-tools 36.0.0
  • SDK location: /usr/lib/android-sdk, also set in android/local.properties

Config

  • capacitor.config.ts: appId: chat.vojo.app, webDir: dist
  • android/ — generated Android Studio project, targetSdkVersion 36, compileSdkVersion 36, minSdkVersion 24

Build scripts

npm run build:android:debug    # full chain: build → sync → debug APK
npm run build:android:release  # full chain: build → sync → release APK
npm run build:android:aab      # full chain: build → sync → release AAB
npm run android:sync           # sync dist/ → android assets
npm run android:apk:debug      # gradle debug build only

APK output: android/app/build/outputs/apk/debug/app-debug.apk

Versioning

versionCode and versionName are derived from git describe --tags --match 'v*' in android/app/build.gradle, mirroring resolveAppVersion() in vite.config.js so the APK's versionName matches __APP_VERSION__ shown in About. Tag is v0.2.0; patch is the commit count since that tag (e.g. v0.2.0-87-g… → versionName 0.2.87). When git is unavailable, falls back to package.json version.

versionCode = major * 1_000_000 + minor * 1_000 + patch
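
As an illustration of the arithmetic above, here is a minimal Java sketch of the mapping from a `git describe` string to versionName and versionCode. It is not the Gradle code itself: the class name `VersionSketch`, the regular expression, the sample hash `gabc1234`, and the choice to treat a bare tag as patch 0 are all assumptions made for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper; the real logic lives in android/app/build.gradle. */
final class VersionSketch {
    // Matches "v0.2.0" or "v0.2.0-87-gabc1234" (tag, optional commit count, optional hash).
    private static final Pattern DESCRIBE =
            Pattern.compile("^v(\\d+)\\.(\\d+)\\.(\\d+)(?:-(\\d+)-g[0-9a-f]+)?$");

    /** Returns {major, minor, patch}, where patch is the commit count since the tag. */
    static int[] parse(String describe) {
        Matcher m = DESCRIBE.matcher(describe.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("unexpected git describe output: " + describe);
        }
        int major = Integer.parseInt(m.group(1));
        int minor = Integer.parseInt(m.group(2));
        // Assumption: the tag's own third component is ignored, and a bare tag means 0 commits.
        int patch = m.group(4) == null ? 0 : Integer.parseInt(m.group(4));
        return new int[] {major, minor, patch};
    }

    static String versionName(String describe) {
        int[] v = parse(describe);
        return v[0] + "." + v[1] + "." + v[2];
    }

    static int versionCode(String describe) {
        int[] v = parse(describe);
        return v[0] * 1_000_000 + v[1] * 1_000 + v[2];
    }
}
```

With the example from the text, `v0.2.0-87-g…` gives versionName 0.2.87 and versionCode 0 * 1_000_000 + 2 * 1_000 + 87 = 2087.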

Key architecture decisions

  • Bundled build. dist/ is copied into the APK — not loaded remotely in a WebView.
  • Service Worker stays active. Critical for authenticated Matrix media (MSC3916 / Matrix spec v1.11+). DO NOT disable. resolveServiceWorkerRequests defaults to true.
  • Edge-to-edge. EdgeToEdge.enable() in MainActivity.java + windowLayoutInDisplayCutoutMode: shortEdges.
  • External links. Opened via @capacitor/browser plugin — see src/app/utils/capacitor.ts.
  • Safe-area coloring. body background-color is bound to the folds theme variable var(--oq6d070) for consistent safe-area coloring.
  • Safe-area insets. Applied on #root (not body) so the theme background extends behind the system bars.

VSCode tasks

See .vscode/tasks.json:

  • Deploy to vojo.chat (Ctrl+Shift+D) — web deploy
  • Deploy to Android (ADB) (Ctrl+Shift+A) — build + adb install

Push string resources (generated)

Push notification text for Android is generated from public/locales/{en,ru}.json (namespace Push) by scripts/gen-push-strings.mjs. The Gradle build runs this automatically via GeneratePushStringsTask registered in android/app/build.gradle through AGP addGeneratedSourceDirectory — output goes to build/generated/res/push/<variant>/values{,-ru}/push_strings.xml. No manual step needed; ./gradlew assembleDebug handles it.

The task requires node in PATH. Terminal builds and CI inherit it from the shell. macOS Android Studio with nvm/fnm: the GUI app may not see nvm-managed node. Workaround: set NODE_BIN=/path/to/node in android/gradle.properties (the task reads it via project.findProperty('NODE_BIN')) or launch AS from a shell that sources your node manager (open -a "Android Studio").

Push polling fallback (WorkManager)

Users on networks that block FCM (mtalk.google.com:5228 — corporate, school and government whitelist intranets, ~5% of our audience) get zero pushes from the primary channel. To cover them we run a WorkManager periodic poll of /_matrix/client/v3/notifications as a parallel best-effort delivery channel. Always on whenever push is enabled — there's no smart-detect-and-switch (FCM gives no client-visible delivery receipts; see push_unifiedpush_phase1.md §11 for the full rationale of why this is the only viable shape).

Components:

| Layer | File | Role |
| --- | --- | --- |
| Worker | VojoPollWorker.java | Periodic fetch of /notifications, flattens response into Sygnal-shape Map<String,String>, routes message/invite → renderMessageNotification, RTC ring → renderMissedCallNotification. Skips events that are read=true, push-rule-suppressed (actions lacks notify), in NotificationDedup, or with ts < watermark. Foreground-gated: doesn't render system notifications while MainActivity.isInForeground (still consumes state). Saves a drain cursor when capped at MAX_PAGES_PER_RUN. |
| Bridge | PollingPlugin.java | Capacitor plugin. JS calls saveSession (token + homeserver, seeds watermark on first use to skip historical backlog), schedule(15) (unique periodic worker), saveRoomNames (room-id → name cache), cancel (awaits WorkManager Operation completion) + clearSession on disable/logout. |
| Renderers | VojoFirebaseMessagingService.java::renderMessageNotification, ::renderMissedCallNotification | Static, Context-parameterised so the Worker can post into the same notification id space as FCM. Message path uses a per-room roomId.hashCode() slot: every new event in a room appends to a MessagingStyle conversation rather than stacking as a separate card (see MessagingStyle pipeline below). Missed-call path uses per-event slots so multiple missed rings stack. After successful nm.notify, mark the event in NotificationDedup so the polling Worker doesn't re-surface it after the user dismisses an FCM-delivered one. |
| Dedup | NotificationDedup.java | Thread-safe shared LRU set of rendered event_ids. Written by both FCM service (background renders AND foreground-skipped events) and Worker (after successful render or foreground-skip). Bounded at 500 entries to comfortably exceed a single Worker run's worst case (MAX_PAGES_PER_RUN × PAGE_LIMIT = 250), persisted in vojo_poll_state SharedPreferences. |
| JS plugin | src/app/plugins/polling.ts | registerPlugin<PollingPluginIface>('Polling', { web: noop }). Web has no analogue (SW already wakes for push), so the fallback is a no-op. |
| Lifecycle | src/app/hooks/usePushNotifications.ts::usePushNotificationsLifecycle | Reactive to usePushEnabled(). On mount with push enabled: saveSession + schedule + initial room-name dump. On visibilitychange → visible: re-saveSession (recovers a 401-cleared credentials slot without remount) + re-dump room names. On unmount or push disable: cancel + clearSession. |
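
To make the Dedup layer concrete, here is a minimal in-memory sketch of a thread-safe, bounded LRU set of event ids. It is an illustration only, not the contents of NotificationDedup.java: the class name `DedupSketch`, its method names, and the use of an access-ordered `LinkedHashMap` are assumptions, and the SharedPreferences persistence described above is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a bounded LRU set of rendered event ids. */
final class DedupSketch {
    private final LinkedHashMap<String, Boolean> seen;

    DedupSketch(final int maxEntries) {
        // accessOrder=true keeps the least-recently-used entry first, so it is evicted first.
        this.seen = new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Marks the event as rendered; returns true only the first time it is seen. */
    synchronized boolean markIfNew(String eventId) {
        return seen.put(eventId, Boolean.TRUE) == null;
    }

    /** True if the event was already rendered by either delivery path. */
    synchronized boolean contains(String eventId) {
        return seen.get(eventId) != null; // get() also refreshes recency in access-order mode
    }

    synchronized int size() {
        return seen.size();
    }
}
```

Both delivery paths would share one instance: each checks `contains` before posting and calls `markIfNew` after a successful render. A capacity of 500 would match the bound described in the table.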

Why ring events seen by polling are rendered as a missed call (not CallStyle): the m.rtc.notification lifetime is 30 seconds, while polling runs at the 15-minute floor of PeriodicWorkRequest. Every ring the Worker observes is already stale and the live call long over, so rendering CallStyle with a ringtone would phantom-ring a dead call. Missed-call style preserves the "you missed a call from X" signal without the wrong UX. Live-call delivery for whitelist users remains a gap; closing it requires a non-FCM live channel (UnifiedPush, see the stale plan above).

Why we do not need a refresh-token flow: Vojo's homeserver is vanilla Synapse without MAS/OIDC (see server-side.md), so access tokens are long-lived. On a 401 the Worker clears its credentials slot and waits for the next foreground app launch to re-bridge them, so no native refresh-token logic is required. If we ever migrate to MAS, the Worker will need a refresh path.

Why our source manifest does not declare RECEIVE_BOOT_COMPLETED: WorkManager's library manifest already declares the permission and the RescheduleReceiver, which the manifest merger folds into the merged manifest. Reboot persistence works end-to-end without our app re-declaring anything. Apps only need to add the permission themselves when they listen for BOOT_COMPLETED for their own purposes.

Edge cases handled:

  • Token rotation (post-MAS migration): currently not bridged from JS to native on token-rotate events. JS re-saves credentials on every lifecycle re-mount AND on visibilitychange → visible, so user-driven re-open recovers within seconds. After a 401 the Worker clears its credentials slot; after a 403 it leaves credentials alone and just skips the cycle (403 is most often a transient rate-limit, not a dead token).
  • First fire after install / re-login: saveSession seeds KEY_LAST_SEEN_TS to System.currentTimeMillis() - 60s on first write, so the Worker doesn't render every historical unread /notifications entry as a fresh push. The 60s buffer tolerates device-clock drift ahead of the homeserver (event ts is server-side); without it a fast-clock device would silently skip fresh events as "older than watermark".
  • POST_NOTIFICATIONS revoked at runtime: Worker bails early on NotificationManagerCompat.areNotificationsEnabled() == false. Without this guard nm.notify would throw SecurityException per event, leave the LRU and watermark unadvanced, and re-walk the same backlog every 15 minutes until the user re-grants permission.
  • Worker > 10 minutes (Android kill timer): bounded by MAX_PAGES_PER_RUN=5 × PAGE_LIMIT=50 + 30s HTTP timeout per call. Cannot exceed ~3 minutes in normal operation. Most polls touch only a single page because the ts watermark short-circuits the loop.
  • Large backlog (>250 events accumulated while offline): when a single fire hits MAX_PAGES_PER_RUN before reaching the watermark, the Worker saves the leftover next_token as KEY_DRAIN_CURSOR AND snapshots the head ts of the first run as KEY_DRAIN_TARGET_TS. Subsequent fires resume from that cursor instead of head; the target ts is the fast-forward destination for the watermark when drain finally completes — without it, the bounded LRU could evict head events and let the post-drain normal run re-render them.
  • Network unavailable: NetworkType.CONNECTED constraint skips the run; next cycle retries.
  • Doze: WorkManager honours maintenance windows. No catch-up — only the next scheduled fire delivers the accumulated backlog. The Worker walks from the head of /notifications and stops as soon as it reaches the watermark, so a Doze-extended gap just produces a larger first-page walk.
  • Pagination assumes newest-first ordering (Vojo runs vanilla Synapse, whose get_push_actions_for_user issues ORDER BY stream_ordering DESC). The Matrix spec for /notifications does not formally mandate this ordering, so if Vojo ever migrates to a homeserver implementation that paginates oldest-first (Conduit, Dendrite, …) the ts < watermark break would clip new events. Revisit the Worker before any such migration.
  • Already-read events (user read on another client) are skipped via the read field on each /notifications entry; their ts still advances the watermark so they don't get re-walked next poll.
  • Muted rooms: actions array on each /notifications entry is consulted; events without notify (i.e. dont_notify from a mute push rule) are skipped. Without this, the mute toggle wouldn't actually mute polling-delivered notifications even though Sygnal honours it for FCM.
  • User in foreground: Worker doesn't render system notifications while MainActivity.isInForeground (live timeline owns UX). State still advances so events don't replay on the next backgrounded poll.
  • FCM + polling double delivery: NotificationDedup is the single source of truth — FCM service and Worker both write to it after successful render, both read it before posting. Even if the user dismisses an FCM-delivered notification before polling fires, the Worker skips it.
  • UTF-8 multi-byte boundaries: readAll accumulates raw bytes and decodes the full buffer once, never per-chunk; otherwise a Cyrillic character straddling an 8 KB read boundary would become U+FFFD.
  • Logout race: initMatrix.ts::logoutClient, clearLocalSessionAndReload, and the SessionLoggedOut listener in ClientRoot.tsx all call polling.cancel() + polling.clearSession() synchronously before window.location.replace, so the Worker can't fire one more time with the stale access_token. cancel() awaits the WorkManager Operation so a fast disable → re-enable cycle doesn't race the KEEP policy. The lifecycle effect's unmount cleanup repeats the same calls as belt-and-suspenders.
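
Several of the edge cases above come down to a per-event decision plus the watermark seed. The sketch below shows one way those rules could fit together. It is not taken from VojoPollWorker.java: the class, enum and method names and the order of the checks are assumptions, and foreground gating and the drain cursor are left out.

```java
import java.util.List;
import java.util.Set;

/** Hypothetical per-event filter for entries returned by /notifications (newest first). */
final class PollFilterSketch {
    enum Decision { RENDER, SKIP, STOP }

    // Buffer that tolerates a device clock running ahead of the homeserver.
    static final long SEED_BUFFER_MS = 60_000L;

    /** Watermark written on the first saveSession: "now" minus the 60 s buffer. */
    static long seedWatermark(long nowMillis) {
        return nowMillis - SEED_BUFFER_MS;
    }

    static Decision decide(long ts, boolean read, List<?> actions, String eventId,
                           long watermarkTs, Set<String> dedup) {
        if (ts < watermarkTs) {
            return Decision.STOP;   // older than the watermark: stop walking pages
        }
        if (read) {
            return Decision.SKIP;   // already read on another client
        }
        if (!actions.contains("notify")) {
            return Decision.SKIP;   // suppressed by a push rule (muted room)
        }
        if (dedup.contains(eventId)) {
            return Decision.SKIP;   // already rendered via FCM or an earlier poll
        }
        return Decision.RENDER;
    }
}
```

In this sketch a SKIP still counts as "seen", so the caller would advance the watermark past skipped events as well, in line with the already-read bullet above.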

Cleanups invoked symmetrically across every logout path: useDisablePushNotifications, logoutClient, clearLocalSessionAndReload, the SessionLoggedOut listener, and the lifecycle effect's unmount all call polling.cancel() + polling.clearSession().
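
The UTF-8 edge case in the list above is easy to get wrong, so here is a minimal sketch of the "accumulate bytes, decode once" approach. The class name `ReadAllSketch`, the `chunkSize` parameter and the wrapping of `IOException` are assumptions made for illustration; the real readAll may differ.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical version of readAll that never decodes a partial chunk. */
final class ReadAllSketch {
    static String readAll(InputStream in, int chunkSize) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[chunkSize];
        try {
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n); // keep raw bytes; a chunk may end mid-character
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Decode the complete buffer once, so multi-byte sequences are never split.
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Decoding each chunk separately with `new String(chunk, 0, n, UTF_8)` would replace the two halves of a split character with U+FFFD, which is the failure the bullet describes.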

MessagingStyle pipeline

Background-rendered message notifications use NotificationCompat.MessagingStyle so multiple events in one room collapse into an expandable conversation card (WhatsApp / Telegram convention) rather than each event posting a separate banner. Notification id is per-room (roomId.hashCode()), not per-event.

Components:

| Layer | File | Role |
| --- | --- | --- |
| Cache | RoomMessageCache.java | Thread-safe ConcurrentHashMap<String, ArrayDeque<Entry>> bounded at 20 messages × 200 rooms. Snapshot is taken INSIDE compute() so a concurrent FCM + Worker append on the same room can't race the copy. Mutated by both VojoFirebaseMessagingService.renderMessageNotification (FCM service path AND Worker path through the same static helper) and appendOutgoingMessage (ReplyReceiver echo). |
| Channels | vojo_messages_dm_v1 (IMPORTANCE_HIGH) + vojo_messages_group_v1 (IMPORTANCE_DEFAULT) under NotificationChannelGroup("vojo_messages_v1"). | Legacy vojo_messages is deleted on first creation of v1. Channel split lets users mute group-room noise in OS settings without losing DM alerts. |
| Metadata snapshot | JS bridges {roomId: {name, isDirect, isEncrypted}} via polling.saveRoomNames into KEY_ROOM_NAMES in vojo_poll_state. | loadRoomMetadata parses tolerantly (legacy roomId: "name" falls back to isDirect=true, isEncrypted=true for safety). Re-dump triggers: mount, visibility-change, ClientEvent.AccountData for m.direct, RoomEvent.Timeline filtered to m.room.encryption. |
| Process-kill recovery | On cache miss, seedCacheFromActiveNotification calls NotificationCompat.MessagingStyle.extractMessagingStyleFromNotification on the on-shade StatusBarNotification to rebuild prior history. | Survives process kill; fails gracefully to single-message conversation if the notification was also dismissed. |
| Receivers | MarkAsReadReceiver (POST /_matrix/client/v3/rooms/{roomId}/receipt/m.read/{eventId} + dismiss), NotificationDismissReceiver (swipe → clear cache so the next push starts fresh), ReplyReceiver (RemoteInput → PUT m.room.message with m.text body + optimistic local echo). | All read credentials from vojo_poll_state SharedPreferences (same lifecycle as VojoPollWorker). |
| Receipt-driven dismiss | JS mx.on(RoomEvent.Receipt) filters own-user receipts, checks room.getUnreadNotificationCount(Total) === 0, calls polling.dismissRoom(roomId) → native nm.cancel + RoomMessageCache.clear. | Mirrors element-web's Notifier.onRoomReceipt. Killed-process dismiss is not covered (no JS context to observe the receipt); this is acceptable because the next FCM push to that room renders a fresh conversation from cache-empty state. |
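
The "snapshot inside compute()" detail of the Cache layer can be sketched as follows. This is a simplified illustration rather than RoomMessageCache.java: the class name `RoomCacheSketch` and the method `appendAndSnapshot` are assumptions, plain strings stand in for the Entry type, and the 200-room bound is omitted.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical per-room message cache; only the per-room bound is modelled. */
final class RoomCacheSketch {
    static final int MAX_MESSAGES_PER_ROOM = 20;

    private final ConcurrentHashMap<String, ArrayDeque<String>> rooms = new ConcurrentHashMap<>();

    /** Appends a message and returns a copy of the room history taken inside compute(). */
    List<String> appendAndSnapshot(String roomId, String message) {
        List<String> snapshot = new ArrayList<>();
        rooms.compute(roomId, (id, deque) -> {
            if (deque == null) {
                deque = new ArrayDeque<>();
            }
            deque.addLast(message);
            while (deque.size() > MAX_MESSAGES_PER_ROOM) {
                deque.removeFirst(); // drop the oldest message once the bound is exceeded
            }
            // Copy while compute() still holds the per-key lock, so a concurrent
            // append to the same room cannot interleave with the copy.
            snapshot.addAll(deque);
            return deque;
        });
        return snapshot;
    }

    /** Called on dismiss so the next push starts a fresh conversation. */
    void clear(String roomId) {
        rooms.remove(roomId);
    }
}
```

The returned list is what a renderer would feed into the MessagingStyle builder. Because the ArrayDeque itself is not thread-safe, copying it outside compute() could observe a half-applied append from the other delivery path.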

Why MessagingStyle vs the old per-event flow: 5 messages in one DM previously produced 5 separate cards in the shade with redundant title/avatar. The MessagingStyle conversation matches WhatsApp/Telegram UX and is the documented Android pattern for messaging apps. See element-android's RoomGroupMessageCreator for the canonical reference.

Why two channels (DM + group) and not per-conversation channels (the fluffychat approach): per-conversation works for low-room-count clients but proliferates user-visible settings entries on a Matrix client with dozens of active rooms. Element-android sidesteps the question with a NOISY/SILENT split based on push rules; we picked a middle ground — bucketed by DM vs group room — which mirrors fluffychat's directChats/groupChats NotificationChannelGroup setup.

Why reply action is gated on !isEncrypted: the Java path has no key material to sign + encrypt outgoing replies with, so an inline reply in an E2EE room would send cleartext (Synapse does not enforce the "encrypted-only" rule, so the leak is real). The snapshot defaults to isEncrypted=true on cache miss and the JS side re-dumps on m.room.encryption state events so the action is dropped within seconds of a room being switched to E2EE.

Why call-session composite dedup (compositeCallDedupKey(roomId, sessionId)): the legacy per-eventId dedup misses re-rings of the same call session because each ring is a fresh m.rtc.notification event with a new event_id. We extract the parent call event_id from content.m.relates_to.event_id (Worker JSON parse) / content_m.relates_to_event_id (FCM Sygnal-flatten) and mark the composite in NotificationDedup the moment we post the first CallStyle. Subsequent ring events for the same session see the mark and skip silently. Mirrors element-web's getIncomingCallToastKey pattern.
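
A minimal sketch of the composite-key idea: two ring events with different event_ids but the same parent call event map to one key, so only the first one posts. The key format (`call|roomId|sessionId`), the class name `CallDedupSketch` and the in-memory `HashSet` are assumptions for illustration; the source only names compositeCallDedupKey(roomId, sessionId) and stores the mark in NotificationDedup.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical session-level dedup for ring notifications. */
final class CallDedupSketch {
    private final Set<String> marked = new HashSet<>();

    // Assumed key shape; the real format is not documented here.
    static String compositeCallDedupKey(String roomId, String sessionId) {
        return "call|" + roomId + "|" + sessionId;
    }

    /**
     * Returns true if this ring should post a notification, false if the same
     * call session (identified by the parent call event_id) already rang.
     */
    synchronized boolean shouldPostRing(String roomId, String parentCallEventId) {
        return marked.add(compositeCallDedupKey(roomId, parentCallEventId));
    }
}
```

Here `parentCallEventId` stands for the value extracted from content.m.relates_to.event_id, so a re-ring with a fresh event_id still resolves to the key that was marked on the first ring.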

Why edit-collapse (m.replace) is NOT implemented: requires parsing content.m.relates_to.rel_type == "m.replace" + finding the original event in the per-room cache and replacing in place. The complication: FCM payloads (Sygnal-flattened) encode nested keys inconsistently across deployments (content_m.relates_to_rel_type vs content_m_relates_to_rel_type vs dot-preserved variants), and the Worker parses raw JSON cleanly while FCM hits one of the flattened shapes. Asymmetric handling (Worker only) creates user-visible drift between delivery paths. Real-world impact is low — users rarely edit notification-flagged messages in the seconds-long window before they're read — so the feature is deferred until we have a uniform key shape from Sygnal config or until a real-world report justifies the parser complexity.

  1. On the phone, enable Wireless debugging, tap "Pair device with pairing code" — note IP, port, 6-digit code.
  2. adb pair <ip>:<pair-port> <code>
  3. adb connect <ip>:<connect-port>

The pair port and the connect port are different — don't mix them up.