Author
Ali Arbab
Project
03 / MagLock Protocol
Status
Hardware-dependent — video demo
MagLock Protocol
Two-door ESP32 + Flutter smart lock — LAN-only, no cloud, no telemetry.
ESP32 firmware drives the dual-door magnetic-lock relays; an ESP32-CAM streams MJPEG over the same closed subnet; a Flutter app stitches them together over plain HTTP on 192.168.4.x. Fail-secure boot order, 800ms relay-fire cooldown, hand-rolled JPEG SOI/EOI byte-stream decoder, and an optional Hinglish voice assistant with five-layer persistent memory. The homeowner owns the firmware.
ESP32 Lock Controller
PORT 80 · ONLINE · 192.168.4.100
- GET /status
- POST /lock
- POST /unlock
- POST /timer
ESP32-CAM AI-Thinker
PORT 80 · ONLINE · 192.168.4.101
- GET /stream
- GET /capture
- GET /led
Source · Flutter · Dart · ESP32 · Arduino C++
- ESP32 (Arduino IDE, ArduinoJson)
- ESP32-CAM (FreeRTOS, hardware JPEG)
- Flutter (Dart 3+, Material 3)
- Provider state
- speech_to_text + flutter_tts
- Inno Setup installer
- Grok-3 (optional cloud LLM)
- Plain HTTP/1.1, no TLS
Problem
Every consumer smart lock I looked at routed door-state data through a server in some other country, behind a vendor account that could be canceled, throttled, or monetized at will. A locked door is the most basic privacy primitive a home has. It shouldn't be conditional on a third party's business model.
Why me
I had two ESP32 boards left over from another project, a recurring household problem where someone always forgot to lock up, and a stubborn objection to any 'smart home' architecture that reaches outside the local network to do its job. The hardware was lying around. The principle wasn't.
Learned
Two ESP32s talking REST over a household WiFi network is harder than two ESP32s talking REST in a controlled testbed. Power-cycle resilience, cooldown debouncing, and 'what happens when the WiFi router reboots at 3am' turned out to be 70% of the firmware. Also: hardcoded credentials are the easiest mistake to make and the hardest to walk back from in public — that became its own post-mortem.
Three independent components on one closed network.
ESP32 lock controller — station-mode WiFi with a fixed IP (192.168.4.100), holding two relays and serving a tiny synchronous HTTP API on port 80. Routes: GET /status, POST /lock?relay={1|2|all}, POST /unlock?relay={1|2|all}, POST /timer (JSON body {"seconds":N}). Auto-lock duration persists in NVS under the namespace "nexus". Door state is intentionally NOT persisted — a brown-out reboot mid-unlock comes back with both relays driven LOW (locked) before the radio is even started.
ESP32-CAM (AI-Thinker) — a separate fixed-IP device (192.168.4.101) running its own HTTP server. GET /stream returns a multipart/x-mixed-replace; boundary=frame MJPEG at SVGA 800×600 ~25fps; GET /capture returns a single QXGA 2048×1536 JPEG at quality 1. Streaming runs in a FreeRTOS task pinned to core 1 (8KB stack), leaving core 0 free for the WebServer + WiFi stack.
Flutter app — a single LockProvider (ChangeNotifier, ~280 lines) is the only orchestrator. Two services hang off it: Esp32Service (HTTP control plane, returns ServiceResult<T> instead of throwing) and StorageService (a thin shared_preferences wrapper). The MJPEG consumer lives directly in CameraFeedWidget, parsing JPEG SOI/EOI markers out of the raw byte stream.
Transport. Plain HTTP. No TLS. No WebSocket. No MQTT. No bearer token, HMAC, or pre-shared key. CORS is permissive (*). This is a deliberate scope choice: the trust boundary is the AP itself — the device pair lives on a SoftAP-style subnet that's not bridged to the home WiFi or the internet, and the only client expected to talk to it is a phone the owner has paired by typing the IP into a settings screen. Adding HMAC-signed POSTs with a shared secret in NVS is the natural v2 step.
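As an illustration of how the relay={1|2|all} query value on the lock routes might be turned into a door selection, here is a minimal host-side C++ sketch. parseRelayParam and the bitmask constants are hypothetical names, not taken from the firmware, and the sketch assumes that an unrecognised value should select no relay at all.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical door bitmask: bit 0 = door 1, bit 1 = door 2.
constexpr std::uint8_t kDoor1 = 0x01;
constexpr std::uint8_t kDoor2 = 0x02;

// Maps the `relay` query value to a door mask.
// Returns 0 for anything other than "1", "2" or "all" (assumed behaviour).
std::uint8_t parseRelayParam(const std::string& value) {
    if (value == "1")   return kDoor1;
    if (value == "2")   return kDoor2;
    if (value == "all") return static_cast<std::uint8_t>(kDoor1 | kDoor2);
    return 0;  // unknown or missing value: touch nothing
}
```

A route handler built on this could answer a zero mask with a 400 response rather than guessing which door was meant.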
246 lines. One sketch. Locks before WiFi.
A single Arduino sketch — no PlatformIO, no separate translation units. Pin defines, route handlers, state machine, setup(), loop() all in one file. Anyone with the Arduino IDE and the ESP32 board package can flash it.
Polarity-agnostic relay control. A single #define retargets the firmware between active-low opto-isolated relay boards (the typical case) and active-high MOSFET drivers, without touching call sites:
#define RELAY1_PIN 26 // Door 1 magnetic lock
#define RELAY2_PIN 27 // Door 2 magnetic lock
#define STATUS_LED_PIN 2 // Onboard blue LED
#define RELAY_ACTIVE_HIGH false // Active-LOW relay modules
#define RELAY_ON (RELAY_ACTIVE_HIGH ? HIGH : LOW)
#define RELAY_OFF (RELAY_ACTIVE_HIGH ? LOW : HIGH)
Boot ordering — fail-secure by construction. The most important detail in the firmware: both relays are commanded LOCK before the radio is started. A brown-out reboot mid-unlock can never come back with a door open. Door state is deliberately NOT persisted to NVS — the device always boots with both doors locked.
void setup() {
  Serial.begin(115200);
  pinMode(RELAY1_PIN, OUTPUT);
  pinMode(RELAY2_PIN, OUTPUT);
  pinMode(STATUS_LED_PIN, OUTPUT);
  relayLock(RELAY1_PIN); // <-- doors locked before WiFi exists
  relayLock(RELAY2_PIN);
  prefs.begin("nexus", true);
  autoLockSeconds = prefs.getInt("timer", 10);
  prefs.end();
  // ...WiFi.begin() comes only after this...
}
Cooperative loop. No xTaskCreate, no semaphores, no FreeRTOS tasks. Three concurrent jobs share loop() via the millis() - last >= interval idiom: HTTP request handling, auto-lock countdown, and a 15-second WiFi watchdog. The only delay() in steady state is a 500ms gap between staggered dual-relay unlock pulses (intentional inrush mitigation).
All timing uses unsigned-arithmetic-wraparound-safe comparisons, so the ~49.7-day millis() rollover is handled correctly. There is no reed switch, no door-position sensor, no buzzer, no keypad, no RFID — the firmware's notion of “locked” is purely the commanded relay state. Closed-loop verification is a v2 hook.
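To make the rollover claim concrete: millis() is a 32-bit millisecond counter, so it wraps after 2^32 ms, which is about 49.7 days. The host-side C++ sketch below shows why the subtraction form of the comparison survives the wrap while the addition form does not. The function names elapsed and elapsedNaive are illustrative, not the firmware's own.

```cpp
#include <cassert>
#include <cstdint>

// True once at least `interval` ms have passed since `last`.
// Unsigned subtraction wraps modulo 2^32, so the difference stays correct
// even when `now` has rolled over past zero and `last` has not.
constexpr bool elapsed(std::uint32_t now, std::uint32_t last, std::uint32_t interval) {
    return static_cast<std::uint32_t>(now - last) >= interval;
}

// The tempting alternative: compute a deadline and compare against it.
// `last + interval` itself overflows near the rollover, so the deadline
// lands in the past and the check fires far too early.
constexpr bool elapsedNaive(std::uint32_t now, std::uint32_t last, std::uint32_t interval) {
    return now >= static_cast<std::uint32_t>(last + interval);
}

// 256 ms before rollover, with a 15-second interval (the watchdog period).
static_assert(!elapsed(0xFFFFFF10u, 0xFFFFFF00u, 15000u), "16 ms in: not yet");
static_assert(elapsed(14744u, 0xFFFFFF00u, 15000u), "exactly 15000 ms later, after the wrap");
static_assert(elapsedNaive(0xFFFFFF10u, 0xFFFFFF00u, 15000u), "naive form fires after 16 ms");
```

The same comparison shape covers the auto-lock countdown and the WiFi watchdog, since only the difference between two readings is ever inspected.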
Init large, drop small. Three flags coordinate stream + snapshot.
ESP32-CAM @ 192.168.4.101 · SVGA 800×600 · ~25 FPS
The init-large-then-shrink trick. The camera driver allocates PSRAM buffers based on the framesize at esp_camera_init. Initialising at the largest mode the firmware will ever use — QXGA 2048×1536 at JPEG quality 1 — guarantees the buffers fit any subsequent mode change. The firmware then immediately drops the sensor to streaming mode (SVGA 800×600 at quality 12). Switching to QXGA on /capture later doesn't have to reallocate. No fragmentation, no contiguous-PSRAM gambling.
Streaming protocol. Plain multipart/x-mixed-replace; boundary=frame MJPEG over HTTP/1.1. The OV2640 hardware-encodes JPEG itself — the ESP32 just shovels bytes from PSRAM to the network, which is why a $7 camera module can stream at 25fps from a 240MHz chip. The streaming task is pinned to core 1 with an 8KB stack, leaving core 0 free for the WebServer:
void streamTask(void* arg) {
  WiFiClient* client = (WiFiClient*)arg;
  client->print("HTTP/1.1 200 OK\r\nContent-Type: "
                "multipart/x-mixed-replace; boundary=frame\r\n\r\n");
  _streaming = true;
  while (client->connected() && !_stopStream) {
    if (_snapPending) { vTaskDelay(10 / portTICK_PERIOD_MS); continue; }
    camera_fb_t* fb = esp_camera_fb_get();
    if (!fb) break;
    // write boundary + headers + frame bytes...
    esp_camera_fb_return(fb);
    vTaskDelay(STREAM_DELAY / portTICK_PERIOD_MS);
    esp_task_wdt_reset(); // feed the watchdog every frame
  }
  // cleanup, vTaskDelete(NULL)...
}
Calling esp_task_wdt_reset() once per frame keeps the task watchdog fed, so a slow client (a dropped TCP connection that hasn't FIN'd yet) that stalls client->write for a while does not trip the watchdog and reboot the chip.
Stream/snap coordination. Three volatile flags and a 2-second timeout, no mutexes: _streaming (set by the streaming task while alive), _stopStream (set by /capture to ask the loop to exit), _snapPending (causes the loop to spin instead of grab while a snapshot is in flight, so the stream resumes immediately afterward without re-spawning the task). The capture handler waits up to 2 seconds for the streaming loop to acknowledge, switches the sensor to QXGA + q=1, settles 300ms, flushes 3 stale frames, grabs one fresh frame, returns it as image/jpeg, then drops the sensor back to SVGA. fb_count = 2 paired with CAMERA_GRAB_LATEST means frames are dropped, never queued — no accumulated latency.
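The per-iteration decision that the three flags drive can be modelled on a host machine. This is a sketch under assumptions: it uses std::atomic<bool> in place of the firmware's volatile flags, and StreamFlags, StreamAction and nextAction are hypothetical names introduced only for illustration. It mirrors the order of checks in the streaming loop shown earlier (exit first, then spin, then grab).

```cpp
#include <atomic>
#include <cassert>

// What the streaming loop should do on its next pass.
enum class StreamAction { Exit, Spin, Grab };

struct StreamFlags {
    std::atomic<bool> streaming{false};    // set by the streaming task while alive
    std::atomic<bool> stopStream{false};   // request: leave the loop
    std::atomic<bool> snapPending{false};  // snapshot in flight: idle, do not grab
};

// One iteration's decision. A stop request or a dead client wins over
// a pending snapshot, and a pending snapshot wins over grabbing a frame.
StreamAction nextAction(const StreamFlags& flags, bool clientConnected) {
    if (!clientConnected || flags.stopStream.load()) return StreamAction::Exit;
    if (flags.snapPending.load()) return StreamAction::Spin;
    return StreamAction::Grab;
}
```

Keeping the decision in one pure function makes the priority between the flags easy to check without a camera attached.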
MJPEG decoder, by hand. Optimistic UI. 800ms cooldown.
MJPEG decoder, by hand. CameraFeedWidget opens the stream as http.Request('GET').send() and parses raw JPEG markers out of the byte chunks, deliberately ignoring the multipart/x-mixed-replace boundary headers entirely. A 500KB runaway-buffer guard trims to the last 200KB if no frame is found, so a corrupt stream can't OOM the app. A 66ms _frameInterval throttles display to ~15fps regardless of incoming rate, keeping setState calls cheap. Image.memory(_currentFrame!, gaplessPlayback: true) is the critical render call: without gaplessPlayback: true, Flutter would flash blank between frames while the next image decodes.
List<int> buf = [];
_streamSub = res.stream.listen((chunk) {
  buf.addAll(chunk);
  int start = -1;
  for (int i = 0; i < buf.length - 1; i++) {
    if (buf[i] == 0xFF && buf[i+1] == 0xD8) start = i; // JPEG SOI
    if (start != -1 && buf[i] == 0xFF && buf[i+1] == 0xD9) { // JPEG EOI
      final frame = Uint8List.fromList(buf.sublist(start, i + 2));
      buf = buf.sublist(i + 2);
      // throttle to ~15 fps + setState if mounted
      break;
    }
  }
  if (buf.length > 500000) buf = buf.sublist(buf.length - 200000); // OOM guard
});
The snapshot dance. The ESP32-CAM cannot stream and capture high-res simultaneously (shared DMA buffer), so a snapshot is six steps with empirically-tuned delays: disconnect the MJPEG stream, wait 300ms; turn the LED flash on, wait 200ms; GET /capture with a 20-second timeout (the firmware switches to QXGA internally); turn the flash off; write the bytes to disk in the snapshots folder with a millisecond-stamped filename; wait 500ms, then re-open the stream. The 300/200/500ms delays are the kind of timings you only land on after a few rounds of “why is my snapshot half green and half correct.”
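The SOI/EOI scan and the 500KB/200KB guard from the Dart listener above carry over to other languages unchanged. The following is a minimal C++ sketch of the same technique, not the app's code: feed, Bytes and the two constants are hypothetical names, and the function returns at most one frame per call, as the Dart loop does with its break.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

constexpr std::size_t kMaxBuffer = 500000;  // runaway guard threshold
constexpr std::size_t kKeepTail  = 200000;  // bytes kept after a trim

// Appends `chunk` to `buf`, then looks for one complete JPEG
// (SOI = FF D8 ... EOI = FF D9). On success the frame is copied into
// `frame`, consumed from `buf`, and the function returns true.
bool feed(Bytes& buf, const Bytes& chunk, Bytes& frame) {
    buf.insert(buf.end(), chunk.begin(), chunk.end());
    bool haveStart = false;
    std::size_t start = 0;
    for (std::size_t i = 0; i + 1 < buf.size(); ++i) {
        if (buf[i] == 0xFF && buf[i + 1] == 0xD8) { start = i; haveStart = true; }
        if (haveStart && buf[i] == 0xFF && buf[i + 1] == 0xD9) {
            const auto first = buf.begin() + static_cast<std::ptrdiff_t>(start);
            const auto last  = buf.begin() + static_cast<std::ptrdiff_t>(i + 2);
            frame.assign(first, last);
            buf.erase(buf.begin(), last);  // drop the frame and any junk before it
            return true;
        }
    }
    // No frame yet: keep only the newest bytes if the buffer has run away.
    if (buf.size() > kMaxBuffer) {
        buf.erase(buf.begin(), buf.end() - static_cast<std::ptrdiff_t>(kKeepTail));
    }
    return false;
}
```

Because the bytes are buffered before scanning, a marker split across two network chunks (FF at the end of one, D8 at the start of the next) is still found on the second call.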
Optimistic UI corrected by polling. On a successful HTTP response, the provider sets door.state = DoorState.locked immediately rather than waiting for the next 2-second poll. The poll is the eventual reconciler — if the relay didn't actually click, the next poll corrects the UI visibly. This is what makes the controls feel snappy on a 2-second polling cadence without ever lying about hardware state. Every action is gated by an 800ms refractory period — hardware relay debounce expressed at the application layer, protecting against double-tap from the user AND from the voice assistant firing two actions in quick succession.
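The 800ms refractory period can be expressed as a small gate object. The sketch below is in C++ rather than the app's Dart, and CooldownGate and tryFire are hypothetical names. It assumes that a rejected tap does not restart the cooldown window, which the description above does not specify either way.

```cpp
#include <cassert>
#include <cstdint>

// Allows an action at most once per `cooldownMs` milliseconds.
class CooldownGate {
public:
    explicit CooldownGate(std::uint32_t cooldownMs) : cooldownMs_(cooldownMs) {}

    // Returns true and starts a new cooldown window if the action may
    // fire at `nowMs`; returns false (and changes nothing) otherwise.
    bool tryFire(std::uint32_t nowMs) {
        const std::uint32_t since = static_cast<std::uint32_t>(nowMs - lastMs_);
        if (fired_ && since < cooldownMs_) return false;
        fired_ = true;
        lastMs_ = nowMs;
        return true;
    }

private:
    std::uint32_t cooldownMs_;
    std::uint32_t lastMs_ = 0;
    bool fired_ = false;  // false until the first accepted action
};
```

Routing both button taps and voice-assistant actions through one gate instance is what would let a single window protect against double-fires from either source.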
Auto-lock as a visible 1-second-tick countdown. Rather than Timer(Duration(seconds: N), …), the provider uses Timer.periodic(Duration(seconds: 1)) and decrements _autoLockCountdown on every tick, calling notifyListeners() so the status banner renders ⏱ 7s … 6s … 5s …. Sacrifices the cleaner fire-once timer for tighter UX feedback.
The lock is the system. The AI is icing.
Status: staged, mid-integration. Maggy (the voice assistant) lives in a sibling maggy raw/ folder, not yet relocated into lib/services/. A real state-of-the-repo finding worth disclosing rather than papering over.
Speech I/O is OS-native, not on-device ML. STT is package:speech_to_text — a thin wrapper over Android SpeechRecognizer / iOS SFSpeechRecognizer. TTS is package:flutter_tts. No Whisper, no Vosk, no tflite. Reasoning happens in the cloud via Grok-3 (https://api.x.ai/v1/chat/completions), with temperature: 0.85, max_tokens: 1024, 15-second timeout. Hinglish is handled at the prompt layer, not the speech layer.
Offline keyword fallback — the resilience layer. When Grok is unreachable, a regex/keyword detector still understands the lock vocabulary and dispatches the right relay:
String _detectOfflineLockIntent(String lower) {
  if (lower.contains('open') || lower.contains('khol') || lower.contains('unlock')) {
    if (lower.contains('top') || lower.contains('1') || lower.contains('ek')) return 'unlock_door1';
    if (lower.contains('bottom') || lower.contains('2') || lower.contains('do')) return 'unlock_door2';
    return 'unlock_all';
  }
  // ... mirror for lock/close/band ...
}
The principle: the lock is the system, the AI is icing. Never let the icing break the cake.
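One variation on the keyword fallback is to match on whole tokens instead of substrings, so that a short selector such as "do" is not triggered by the word "door". The C++ sketch below is an assumption-laden illustration, not the app's detector: tokenize, hasStem, hasWord and detectUnlockIntent are hypothetical names, verbs are matched by prefix (so "kholo" still counts as "khol"), and only the unlock branch is shown. The Hinglish imperative "khol do" would still select door 2 here; the sketch leaves that ambiguity unresolved.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Lowercases the input and splits it into alphanumeric tokens.
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> tokens;
    std::string current;
    for (unsigned char c : text) {
        if (std::isalnum(c)) {
            current.push_back(static_cast<char>(std::tolower(c)));
        } else if (!current.empty()) {
            tokens.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) tokens.push_back(current);
    return tokens;
}

// True if any token starts with one of the verb stems.
bool hasStem(const std::vector<std::string>& tokens, const std::vector<std::string>& stems) {
    for (const auto& t : tokens)
        for (const auto& s : stems)
            if (t.compare(0, s.size(), s) == 0) return true;
    return false;
}

// True if any token equals one of the words exactly.
bool hasWord(const std::vector<std::string>& tokens, const std::vector<std::string>& words) {
    for (const auto& t : tokens)
        for (const auto& w : words)
            if (t == w) return true;
    return false;
}

// Returns "unlock_door1", "unlock_door2", "unlock_all", or "" if no unlock verb is present.
std::string detectUnlockIntent(const std::string& utterance) {
    const auto tokens = tokenize(utterance);
    if (!hasStem(tokens, {"open", "khol", "unlock"})) return "";
    if (hasWord(tokens, {"top", "1", "ek"})) return "unlock_door1";
    if (hasWord(tokens, {"bottom", "2", "do"})) return "unlock_door2";
    return "unlock_all";
}
```

With this shape, "Open the door" resolves to unlock_all because "door" is compared as a whole token and never equals "do".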
Five-layer persistent memory engine. All on shared_preferences. No vector DB, no embeddings, no LangChain. Eight keys: today_log, today_date, episodes, notes, profile, recent_history, activity_log, stats. When today_log exceeds 50 message exchanges, the oldest 50 are folded into a new Episode with auto-extracted keywords + quotes flagged “notable.” Time-proximity scoring weights episodes within ±2 days of a target +3, within ±7 days +1. ~40 lines of regex + scoring instead of a vector DB. Works because the dataset is one conversation.
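The time-proximity rule is small enough to state as a function. This C++ sketch assumes dates have already been reduced to whole-day indices (for example, days since some fixed epoch) and that the bonus is symmetric around the target; proximityBonus is a hypothetical name, and how the bonus combines with keyword scores is not shown.

```cpp
#include <cassert>

// Bonus added to an episode's score based on how close its day is to
// the target day: within 2 days -> +3, within 7 days -> +1, else 0.
int proximityBonus(long episodeDay, long targetDay) {
    const long gap = episodeDay > targetDay ? episodeDay - targetDay
                                            : targetDay - episodeDay;
    if (gap <= 2) return 3;
    if (gap <= 7) return 1;
    return 0;
}
```

For a target on day 10, episodes on days 8 through 12 score +3, days 3 through 7 and 13 through 17 score +1, and anything further away gets no bonus.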
Brand mid-rename
pubspec.yaml says MagLock_Protocol; the Android manifest launcher label says NEXUS LOCK; the README still uses the old nexus_lock/ folder structure. The repo shows the seams of an in-progress rename.
Hardcoded WiFi credentials in firmware
Both lock_controller.ino and cam_firmware.ino carry literals; they must be redacted in any quoted snippet. The v2 path is provisioning via NVS or a one-time SoftAP captive portal.
No TLS, no auth on the lock REST endpoints
Anyone reachable on the closed subnet can curl -X POST .../unlock?relay=all. Deliberate: the trust boundary is the AP. The natural v2 step is HMAC-signed requests with a shared secret in NVS.
Default app passcode 1234
Settings-configurable, but it ships as a placeholder and gates only the UI, not the network protocol.
Only Android is realistically tested
iOS / macOS / Windows / Linux / Web platform builds are stock flutter create scaffolds. No Podfile for iOS, no NSMicrophoneUsageDescription, no ATS exception for HTTP-to-LAN-IP.
Web is architecturally non-viable
A LAN-control app cannot run in a browser: the ESP32 doesn't send CORS headers; an HTTPS-hosted build hits mixed-content blocking on HTTP-to-LAN-IP requests. The case study's “I learned the browser's security model says no” beat.
Tests are zero-meaningful
test/widget_test.dart is the unmodified flutter create counter smoke test. The case study should not claim test coverage.
- 800ms: relay-fire cooldown
- ~15fps: MJPEG display throttle
- 500/200KB: buffer guard / trim
- 2000ms: status poll interval
- 3000ms: stream reconnect backoff
- 20s: GET /capture timeout
- 2048×1536: QXGA snapshot @ q=1
- 800×600: SVGA stream @ ~25fps
- 8KB: FreeRTOS streaming stack
- 246 / 297: lines (lock fw / cam fw)
- ~280: lines in LockProvider
- 49.7 days: millis() rollover safe