Senior AI/ML Engineer
Job Description
Model Development & Optimization
· Fine-tune and adapt state-of-the-art OCR/document models (Donut) for production use.
· Maintain and enhance existing AI models for OCR on Vietnamese ID cards (CCCD) and extend them to other document types (passports, driver's licenses, bank documents).
· Optimize training and inference pipelines for performance, scalability, and cost efficiency.
Data Pipeline & Quality Management
· Manage large datasets combining synthetic and real-world document images.
· Build preprocessing and augmentation pipelines: image quality checks, blur/rotation detection, Vietnamese text normalization, PII masking.
· Ensure data quality and evaluation consistency across multiple document types.
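As an illustration of the Vietnamese text normalization step listed above, here is a minimal sketch. It assumes normalization means Unicode NFC composition plus whitespace cleanup; the function name `normalize_vietnamese` is hypothetical, and the team's actual pipeline may apply different or additional rules.

```python
import unicodedata


def normalize_vietnamese(text: str) -> str:
    """Illustrative normalizer (hypothetical helper, not the team's actual code)."""
    # NFC merges a base letter and its combining accent marks into a single
    # precomposed code point, so visually identical strings compare equal.
    composed = unicodedata.normalize("NFC", text)
    # Collapse runs of whitespace and strip leading/trailing spaces.
    return " ".join(composed.split())
```

Normalizing both OCR output and ground-truth labels the same way keeps accent encoding differences from being counted as recognition errors.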
Accuracy & Performance Evaluation
· Analyze failed predictions (e.g., accents, truncated fields, misrecognized entities) and integrate findings into retraining cycles.
· Implement image/document quality control to prevent poor inputs from degrading OCR accuracy.
· Define and monitor evaluation metrics: character/word accuracy, exact match rate, edit distance, latency.
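The evaluation metrics named above can be sketched as follows. This is a minimal illustration that assumes edit distance means Levenshtein distance and that character accuracy is defined as one minus the distance divided by the reference length; the function names are hypothetical, and the team's exact definitions may differ.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def char_accuracy(pred: str, ref: str) -> float:
    """Assumed definition: 1 - distance / len(ref), clipped at zero."""
    if not ref:
        return 1.0 if not pred else 0.0
    return max(0.0, 1.0 - edit_distance(pred, ref) / len(ref))


def exact_match_rate(preds: Sequence[str], refs: Sequence[str]) -> float:
    """Fraction of fields whose prediction equals the reference exactly."""
    if not refs:
        return 0.0
    return sum(p == r for p, r in zip(preds, refs)) / len(refs)
```

Exact match rate is the strictest of the three, since a single wrong accent fails the whole field, while character accuracy shows how close a failed prediction was.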
Production & Monitoring
· Investigate and resolve production failures, manage rollbacks, and improve system robustness.
· Deploy, monitor, and maintain OCR models serving production workloads (100k+ documents/month).
· Collaborate with backend engineers to integrate OCR APIs with downstream systems.
Collaboration & Leadership
· Document experiments, model updates, and operational practices.
· Mentor junior engineers in computer vision and OCR best practices.
· Contribute to the long-term roadmap for Document AI, beyond ID cards, to support broader fintech/eKYC and document processing needs.
Job Requirements
Must-have
· Practical experience in OCR or Computer Vision (e.g., image preprocessing, OpenCV).
· 3+ years of AI/ML engineering experience with Python and PyTorch.
· Experience scaling machine learning services for high traffic.
· Experience deploying ML models into production environments.
· Familiarity with deep learning model training and fine-tuning, preferably with HuggingFace Transformers or OCR frameworks (PaddleOCR, Tesseract).
· Experience with Vietnamese text processing (accents, tokenization, normalization).
· Knowledge of Linux, Docker, and Git.
Nice-to-have
· Background in fintech/eKYC or handling sensitive/PII data.
· Knowledge of MLOps tools (Weights & Biases, MLflow, DVC).
· Model optimization skills: quantization, distillation, ONNX/TensorRT.
Soft Skills
· Strong ownership mindset: accountable for the full lifecycle of OCR models.
· Collaborative attitude: work closely with backend, product, and QA teams.
· Problem-solving ability: capable of debugging training and inference issues.
· Communication skills: explain ML concepts and findings to technical and non-technical stakeholders.
Tech Stack
· OpenCV, PIL
· Python, PyTorch, HuggingFace Transformers, PaddleOCR
· MLflow / Weights & Biases (nice-to-have)
· Git, DVC (optional)
· Docker, Linux
Benefits
Laptop, insurance, company trips, allowances, overseas trips, uniform, bonuses, health care, training, salary increases, business travel expenses, annual leave, sports club
Last updated: 2026-01-25 14:05:03