“Can a machine learn to feel the mood of a nation in crisis — from
nothing but its tweets?”
01
The Dataset
1,539
Comments collected from X (Twitter)
2025-2026
Collection period
3 Classes
Sentiment categories
363 Positive
493 Neutral
683 Negative
Manual + AI
The data was labelled with the help of ChatGPT and manually verified to ensure correctness
Curated & Clean
✓ No slang
✓ No code-switching
✓ No typos
✓ Standard Indonesian only
02
The Contenders
CHALLENGER
Bi-LSTM
Bidirectional Long Short-Term Memory
A recurrent neural network that reads each comment word by word —
forwards and backwards — building meaning through sequential memory.
It learns from word embeddings trained on the dataset itself.
Architecture: Recurrent (RNN)
Context: Sequential, both directions
Pre-training: None (learns from scratch)
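To make "forwards and backwards" concrete, here is a minimal sketch of a bidirectional recurrent pass. It is an illustration under assumptions, not the model used in this project: it swaps the LSTM cell for a plain tanh recurrent cell to stay short, the weights are random rather than trained, and the names `rnn_pass`, `bidirectional_states`, `EMBED` and `HIDDEN` are hypothetical.

```python
import math
import random

def rnn_pass(vectors, w_in, w_rec):
    """Run a plain tanh recurrent cell over a sequence; return every hidden state."""
    hidden = [0.0] * len(w_rec)
    states = []
    for x in vectors:
        new_hidden = []
        for j in range(len(hidden)):
            total = sum(w_in[j][i] * x[i] for i in range(len(x)))
            total += sum(w_rec[j][k] * hidden[k] for k in range(len(hidden)))
            new_hidden.append(math.tanh(total))
        hidden = new_hidden
        states.append(hidden)
    return states

def bidirectional_states(vectors, fwd_weights, bwd_weights):
    """Concatenate left-to-right and right-to-left states for each position."""
    forward = rnn_pass(vectors, *fwd_weights)
    # Read the sequence in reverse, then flip the states back into token order.
    backward = rnn_pass(vectors[::-1], *bwd_weights)[::-1]
    return [f + b for f, b in zip(forward, backward)]

def random_weights(rng, hidden_size, input_size):
    """Untrained weights, for illustration only."""
    w_in = [[rng.uniform(-0.5, 0.5) for _ in range(input_size)]
            for _ in range(hidden_size)]
    w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
             for _ in range(hidden_size)]
    return w_in, w_rec

rng = random.Random(0)
EMBED, HIDDEN = 4, 3  # assumed toy sizes
# Hypothetical 5-token comment, each token already mapped to a 4-dim embedding.
comment = [[rng.uniform(-1, 1) for _ in range(EMBED)] for _ in range(5)]
fwd = random_weights(rng, HIDDEN, EMBED)
bwd = random_weights(rng, HIDDEN, EMBED)
states = bidirectional_states(comment, fwd, bwd)
print(len(states), len(states[0]))  # 5 positions, 2 * HIDDEN features each
```

Each position ends up with one half that has seen only the words to its left and one half that has seen only the words to its right, which is what lets a classifier on top use context from both directions.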
CONTENDER
IndoBERT
Transformer for Indonesian Language
A transformer pre-trained on a massive Indonesian text corpus, then
fine-tuned on the labelled comments. It reads the whole comment at
once, weighing every word against every other word.
Architecture: Transformer
Context: Full-sentence attention
Pre-training: Large Indonesian corpus
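"Weighing every word against every other word" refers to self-attention. The sketch below shows scaled dot-product self-attention in its simplest form. It is a simplification, not IndoBERT's actual layer: a real transformer applies learned query, key and value projections and uses several heads, all of which are omitted here, and the toy token vectors are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention with identity projections (no learned weights)."""
    dim = len(vectors[0])
    outputs, weights = [], []
    for query in vectors:
        # Score this token against every token in the comment, itself included.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        row = softmax(scores)
        weights.append(row)
        # The new representation is a weighted mix of all token vectors.
        outputs.append([sum(w * v[d] for w, v in zip(row, vectors))
                        for d in range(dim)])
    return outputs, weights

# Hypothetical 3-token comment with 2-dim token vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Every row of `weights` covers the whole comment at once, so no token has to wait for information to be carried step by step along the sequence.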
03
The Method
1. Collect: Scrape crisis-related comments from X
2. Clean: Remove noise, normalise text
3. Label: Tag 3 sentiment classes with ChatGPT, then verify by hand
4. Tokenize: Word embeddings (Bi-LSTM) vs WordPiece (IndoBERT)
5. Train: Fit both models on the same split
6. Evaluate: Compare on the test set
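Training both models on the same split is what makes the comparison fair. One way to get a reproducible split that also preserves class proportions is sketched below. The stratification, the 20% test fraction, the seed and the toy label list are all assumptions for illustration; the deck does not state how the split was made.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_indices, test_indices), keeping class proportions in both."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # fixed seed: every model sees the same split
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        cut = round(len(indices) * test_fraction)
        test.extend(indices[:cut])
        train.extend(indices[cut:])
    return sorted(train), sorted(test)

# Hypothetical label list, not the real dataset.
labels = ["negative"] * 50 + ["neutral"] * 30 + ["positive"] * 20
train, test = stratified_split(labels)
print(len(train), len(test))  # 80 20
```

Computing the index lists once and passing the same `train` and `test` to both the Bi-LSTM and the IndoBERT pipelines means any difference in scores comes from the models, not from the data each one happened to see.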
04
Head-to-Head Results
IndoBERT
Bi-LSTM
F1 is the metric to highlight when classes are imbalanced. Crisis
comments usually skew negative, and negative is the largest class
here, so accuracy alone can be misleading.
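A small worked example shows why accuracy can mislead on imbalanced classes. The sketch uses macro-averaged F1, which is an assumption since the deck says only "F1", and the labels and predictions are toy data, not results from either model.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_for_class(y_true, y_pred, label):
    """F1 for one class, computed as 2*TP / (2*TP + FP + FN)."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so small classes count as much as large ones."""
    labels = sorted(set(y_true))
    return sum(f1_for_class(y_true, y_pred, label) for label in labels) / len(labels)

# Toy data: a degenerate model that always predicts the majority class.
y_true = ["neg"] * 6 + ["neu"] * 2 + ["pos"] * 2
y_pred = ["neg"] * 10

print(f"accuracy = {accuracy(y_true, y_pred):.2f}")  # accuracy = 0.60
print(f"macro F1 = {macro_f1(y_true, y_pred):.2f}")  # macro F1 = 0.25
```

The model never identifies a neutral or positive comment, yet accuracy still reports 60%. Macro F1 drops to 0.25 because two of the three classes score zero.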
05
What Indonesia Felt
Sentiment across all comments
Based on 1,539 labelled comments
42.6%
Negative · 656 comments
Anxiety over prices, jobs and the rupiah dominated the
conversation.
26.0%
Neutral · 400 comments
Factual reporting, news links and questions without
clear emotion.
31.4%
Positive · 483 comments
Hope, support for policy responses, or optimism about
recovery.
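The percentages follow directly from the three counts listed in this section. A quick check, using only those counts:

```python
# Counts as shown in this section.
counts = {"Negative": 656, "Neutral": 400, "Positive": 483}

total = sum(counts.values())
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

print(total)   # 1539
print(shares)  # {'Negative': 42.6, 'Neutral': 26.0, 'Positive': 31.4}
```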
06
The Verdict
IndoBERT led on accuracy and F1, but the better choice depends on
what the project values most.
Pre-training on Indonesian gave the transformer an edge in
understanding context and subtle wording. Yet the lighter Bi-LSTM
trained faster and on far less compute, making it a practical option
when resources are tight.
IndoBERT — Strengths
- Higher accuracy & F1
- Understands context & nuance
- Benefits from pre-training
Bi-LSTM — Strengths
- Faster, cheaper to train
- Lighter to deploy
- Simpler to interpret