Hi. I am
Taha Bouhafa
> AI Engineering Student
Exploring intelligent systems at the intersection of vision and data.
I specialize in Computer Vision and Deep Learning, building robust models
that turn visual data into accurate and impactful solutions.
About Me
I’m Taha Bouhafa, an engineering student specializing in Big Data and Artificial Intelligence at ENSA Tétouan. Passionate about AI, I focus on deep learning, natural language processing, and computer vision.
With practical experience in Python, PyTorch, TensorFlow, and scikit-learn, I enjoy applying state-of-the-art techniques such as Retrieval-Augmented Generation (RAG), with frameworks like LangChain, to build intelligent, context-aware applications.
I am eager to grow as an AI practitioner, contribute to impactful projects, and solve real-world problems by combining data, language, and intelligence.
Tech Stack:
Programming & AI:
Tools & Platforms:
Education
ENSA Tétouan, Morocco
Sept 2023 – Present
Engineering Cycle in Big Data and Artificial Intelligence. Courses: Machine Learning, NLP, Deep Learning, Computer Vision, Big Data, Data Visualisation.
ENSA Tétouan, Morocco
Sept 2021 – July 2023
Integrated Preparatory Classes. Strong foundations in mathematics, physics, and computer science.
EST Agadir, Morocco
Sept 2020 – July 2021
DUT in Computer Science. Focus on programming, algorithms, databases, and software development.
Morocco
Sept 2019 – July 2020
Baccalaureate in Mathematical Sciences with honors.
Experience
CRI Tangier-Tétouan-Al Hoceima
June 2025 – September 2025
AI Research Internship
During the internship, I explored Retrieval-Augmented Generation (RAG) architectures and evaluated a range of open-source large language models (Mistral, DeepSeek, LLaMA). I built a complete RAG pipeline with LangChain, integrating vector-based retrieval (FAISS), keyword-based retrieval (BM25), and contextual generation through a FastAPI backend. I also created a responsive web front-end (HTML/CSS/JS) that interfaces with the API, delivering an interactive user experience designed to support the information-retrieval needs of the Regional Investment Center of the Tangier-Tétouan-Al Hoceima region (CRI TTAH). The system was evaluated with RAGAS to assess its retrieval and generation performance.
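One common way to combine a vector retriever with a keyword retriever is reciprocal rank fusion (RRF). The sketch below is illustrative only and is not the internship code: the description above does not say how the FAISS and BM25 results were merged, so RRF, the constant `k=60`, and the document ids are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in;
    k=60 is a conventional default, assumed here for illustration.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result orders from the two retrievers.
dense_hits = ["doc_b", "doc_a", "doc_d"]    # e.g. a FAISS similarity search
keyword_hits = ["doc_a", "doc_c", "doc_b"]  # e.g. a BM25 keyword search

fused = reciprocal_rank_fusion([dense_hits, keyword_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, which is the main appeal of fusing by rank rather than by raw scores that live on different scales.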
Adventures in Development:
Top Personal Projects
U-Net3+ with Deep Supervision — Aerial Image Segmentation (Potsdam)
Semantic segmentation of ultra-high-resolution aerial RGB images (ISPRS Potsdam) using U-Net3+ with optional deep supervision. Includes tiling pipeline, RGB→class mask conversion, training scripts, metrics (IoU, Dice), and visualization.
Features
U-Net3+ architecture with full-scale skip connections
Optional deep supervision for intermediate decoder outputs
Tiling pipeline for 6000×6000 Potsdam images (256/512 px)
RGB mask → indexed class mask converter
Evaluation with IoU and Dice Score, plus visualizations
Tech Stack: PyTorch, TorchMetrics, NumPy, Pillow, Matplotlib
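The preprocessing and evaluation steps listed above can be sketched in plain Python. This is a simplified illustration, not the project code, which would operate on NumPy arrays or tensors. The colours follow the published ISPRS Potsdam label scheme, but the class index order, the edge handling in `tile_origins` (the last tile is shifted back so it ends at the image border), and all function names are assumptions.

```python
# ISPRS Potsdam label colours; the index assigned to each class is an assumption.
POTSDAM_PALETTE = {
    (255, 255, 255): 0,  # impervious surfaces
    (0, 0, 255): 1,      # building
    (0, 255, 255): 2,    # low vegetation
    (0, 255, 0): 3,      # tree
    (255, 255, 0): 4,    # car
    (255, 0, 0): 5,      # clutter / background
}


def rgb_mask_to_index(mask_rows, palette=POTSDAM_PALETTE):
    """Map rows of (R, G, B) tuples to rows of class indices.

    Raises KeyError for a colour that is not in the palette.
    """
    return [[palette[pixel] for pixel in row] for row in mask_rows]


def tile_origins(size, tile):
    """Top-left offsets that cover `size` pixels with `tile`-sized windows."""
    if tile > size:
        raise ValueError("tile must not exceed the image size")
    origins = list(range(0, size - tile + 1, tile))
    if origins[-1] != size - tile:
        origins.append(size - tile)  # overlapping last tile reaches the border
    return origins


def iou_and_dice(pred, target, cls):
    """IoU and Dice for one class over flat sequences of class indices."""
    tp = sum(p == cls and t == cls for p, t in zip(pred, target))
    fp = sum(p == cls and t != cls for p, t in zip(pred, target))
    fn = sum(p != cls and t == cls for p, t in zip(pred, target))
    union = tp + fp + fn
    if union == 0:
        return float("nan"), float("nan")  # class absent from both masks
    return tp / union, 2 * tp / (2 * tp + fp + fn)


print(len(tile_origins(6000, 512)) ** 2, "tiles of 512 px per 6000x6000 image")
```

Because 6000 is not a multiple of 256 or 512, some policy for the border is needed; padding the image instead of overlapping the last tile would be an equally reasonable choice.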

CivicVision, Pedestrian Crossing Behavior Recognition
AI-based computer vision system for predicting pedestrian crossing intention in real time. Uses a Two-Stream I3D architecture to analyze RGB and optical flow data, enabling risk-aware decision making for intelligent transportation systems and autonomous vehicles.
Features
Two-Stream I3D model using RGB and optical flow inputs
Spatio-temporal behavior analysis including motion and body cues
Real-time crossing intention prediction with risk probability score
Decision engine for trajectory maintenance or emergency braking
Interactive React dashboard with live metrics and visualization
Tech Stack: Python, TensorFlow, Keras, I3D, OpenCV, React, JAAD Dataset
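To illustrate how a risk probability score can drive the decision step, here is a minimal sketch. It is not the CivicVision implementation: the class name, the moving-average smoothing, the window length, and the 0.7 braking threshold are all hypothetical values chosen for the example.

```python
from collections import deque


class CrossingDecision:
    """Turn per-frame crossing probabilities into a driving action."""

    def __init__(self, brake_threshold=0.7, window=5):
        self.brake_threshold = brake_threshold  # hypothetical threshold
        self.recent = deque(maxlen=window)      # last `window` probabilities

    def update(self, risk_probability):
        """Record a new probability and return (action, smoothed_risk)."""
        self.recent.append(risk_probability)
        smoothed = sum(self.recent) / len(self.recent)
        if smoothed >= self.brake_threshold:
            return "emergency_brake", smoothed
        return "maintain_trajectory", smoothed


engine = CrossingDecision(brake_threshold=0.7, window=3)
actions = [engine.update(p)[0] for p in (0.1, 0.9, 0.95, 0.95)]
print(actions)
```

Averaging over a short window keeps a single noisy frame from triggering a brake, at the cost of reacting a few frames later.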

Visual Question Answering (VQA)
Developed a deep learning system that answers natural language questions about images. Combined ResNet50 for image encoding and BERT for text processing to predict answers from the VQA v2.0 dataset.
Features
Integrated BERT and ResNet for multi-modal input processing
Processed COCO-VQA JSON annotations for majority answers
Tokenized questions and created a top-1000 answer vocabulary
Extracted visual features from images using ResNet50
Tech Stack: PyTorch, Transformers, ResNet50, BERT, NumPy, Matplotlib
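The annotation-processing steps above can be sketched as follows. This is an illustration under assumptions rather than the project code: it assumes each annotation carries an `answers` list of dictionaries with an `answer` field, as in the VQA v2.0 annotation files, and the function names and sample data are hypothetical.

```python
from collections import Counter


def majority_answer(annotation):
    """Most frequent answer string among the annotators of one question."""
    counts = Counter(entry["answer"] for entry in annotation["answers"])
    return counts.most_common(1)[0][0]


def build_answer_vocab(annotations, top_k=1000):
    """Map the top_k most frequent majority answers to class indices."""
    counts = Counter(majority_answer(a) for a in annotations)
    return {answer: idx for idx, (answer, _) in enumerate(counts.most_common(top_k))}


# Tiny hypothetical sample in the assumed annotation layout.
annotations = [
    {"answers": [{"answer": "yes"}, {"answer": "yes"}, {"answer": "no"}]},
    {"answers": [{"answer": "yes"}, {"answer": "yes"}, {"answer": "yes"}]},
    {"answers": [{"answer": "2"}, {"answer": "2"}, {"answer": "3"}]},
]

vocab = build_answer_vocab(annotations, top_k=2)
print(vocab)
```

Restricting the output space to the most frequent answers turns answer prediction into a fixed-size classification problem; questions whose majority answer falls outside the vocabulary then have to be dropped or mapped to a catch-all class.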

ODQA, Open Domain Question Answering System
End-to-end open-domain question answering system that retrieves relevant passages from a large document collection and extracts precise answer spans using a LoRA fine-tuned BERT reader. Includes a full-stack web application for interactive querying.
Features
Dense Passage Retrieval using DPR and FAISS indexing
LoRA-adapted BERT reader for efficient answer span extraction
Confidence scoring and top-k passage selection
FastAPI backend with question and conversation endpoints
Interactive React web interface with authentication support
Tech Stack: PyTorch, Hugging Face Transformers, FAISS, FastAPI, React, Tailwind CSS
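As a rough sketch of top-k passage selection with a confidence score: the code below keeps the k highest-scoring passages and normalises their scores with a softmax. How the project actually computes confidence is not stated above (a reader's span logits are another common source), so the softmax over retrieval scores, the function names, and the sample scores are assumptions.

```python
import heapq
import math


def top_k_passages(scores, k):
    """Return (passage_id, score) pairs for the k highest retrieval scores."""
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


def softmax_confidence(scored):
    """Convert raw scores into confidences that sum to 1."""
    peak = max(score for _, score in scored)  # subtract the max for stability
    exps = [(pid, math.exp(score - peak)) for pid, score in scored]
    total = sum(value for _, value in exps)
    return {pid: value / total for pid, value in exps}


# Hypothetical inner-product scores from a dense retriever.
retrieval_scores = {"p1": 71.2, "p2": 69.8, "p3": 74.5, "p4": 60.0}

selected = top_k_passages(retrieval_scores, k=2)
confidence = softmax_confidence(selected)
print(selected, confidence)
```

The confidences are relative to the selected passages only, so they say which candidate looks best, not whether any of them actually contains the answer.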

Sign Language Recognition (ASL)
Built a real-time web app for American Sign Language detection using YOLOv8. Users can upload images or videos, or use a live webcam stream, to detect ASL letters instantly.
Features
YOLOv8-based object detection
Real-time webcam detection with OpenCV
Video conversion using MoviePy
Flask app with multi-page routing (Home, Detect, Stream)
Tech Stack: Python, Flask, YOLOv8, OpenCV, MoviePy, Bootstrap

Semantic Book Recommender
Semantic recommendation system using sentence-transformers and LangChain. Users get book suggestions based on prompts, emotions in descriptions, and filtered categories.
Features
Semantic search with MiniLM embeddings
Emotion-based sorting (joy, fear, sadness, etc.)
Category filtering
Interactive Gradio dashboard
Tech Stack: LangChain, ChromaDB, Gradio, MiniLM, Pandas, NumPy
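The core of semantic search with category filtering can be shown with cosine similarity over embedding vectors. This is a toy sketch, not the project code: in the real system the embeddings would come from a MiniLM model and be stored in a vector database, whereas the 3-dimensional vectors, book titles, and function names here are made up for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(query_vec, books, category=None, top_n=3):
    """Titles of the top_n books most similar to the query, optionally filtered."""
    candidates = [b for b in books if category is None or b["category"] == category]
    ranked = sorted(candidates, key=lambda b: cosine(query_vec, b["embedding"]), reverse=True)
    return [b["title"] for b in ranked[:top_n]]


# Hypothetical catalogue with toy embeddings.
books = [
    {"title": "A", "category": "Fiction", "embedding": [1.0, 0.0, 0.0]},
    {"title": "B", "category": "Fiction", "embedding": [0.8, 0.6, 0.0]},
    {"title": "C", "category": "Nonfiction", "embedding": [0.9, 0.1, 0.0]},
]

query = [1.0, 0.0, 0.0]  # stands in for the embedded user prompt
print(recommend(query, books))
print(recommend(query, books, category="Fiction"))
```

Filtering before ranking keeps the category constraint strict; sorting the filtered results by an emotion score instead of similarity would be a small change to the `key` function.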

Skill Radar: AI & Data Job Forecasting
Dashboard and prediction platform providing insights on Data Science & AI job trends, skill demand forecasting, skill recommendations, and salary estimations based on real job data.
Features
NER-based skill extraction using JobBERT
Skill demand forecasting with Prophet
Skill recommendation with Deep Learning
Salary estimation with regression
Streamlit dashboard for full interactivity
Tech Stack: Streamlit, Prophet, TensorFlow, Hugging Face, Scikit-learn, MongoDB

Certifications

Generative AI with Diffusion Models - NVIDIA (Jan 2025)
NVIDIA
Vision Language Models (VLM) Bootcamp - OpenCV University (Nov 2025)
OpenCV University
Intro to Transformer-Based NLP - NVIDIA (Jan 2025)
NVIDIA
Supervised ML: Regression & Classification - Stanford (Oct 2024)
Stanford University
Data Visualization with Python - IBM (May 2024)
IBM
Tools for Data Science - IBM (Apr 2024)
IBM
What is Data Science? - IBM (Apr 2024)
IBM
Exploring Internet of Things with Cisco Packet Tracer - Cisco Networking Academy (Dec 2025)
Cisco Networking Academy