1 Introductions and What the Course is About and Prerequisites (64.09 MB)
2 Course Structure (21.05 MB)

1 What's Next (63.78 MB)

1 Development Environment Setup - Overview (27.49 MB)
2 Setup OpenAI API Account and API Key (45.61 MB)
3 Setup the Unstructured Account and FREE API Key (27.25 MB)
4 Unstructured Framework Test Run (41.13 MB)

1 Data Preprocessing Deep Dive - Overview (101.36 MB)
2 Data Preprocessing for LLMs Overview - Why Data Preprocessing is Hard (9.07 MB)
3 Challenges with Unstructured Data (15.64 MB)
4 How Content Extraction Works - Cleaning and Data Normalization (51.27 MB)
5 Chunking and Structuring Data and Workflow Orchestration (132.91 MB)
6 The Unstructured Framework - The Whole Workflow and Overview (139.21 MB)

1 Check in (20.98 MB)

1 Hands-on Preprocessing a PDF File and Dissecting the Extracted JSON Data (113.89 MB)
2 Hands-on Preprocessing a PPTX (PowerPoint) File (69.02 MB)
3 Hands-on Preprocessing an HTML File (21.75 MB)
4 Benefits of Normalizing Content - Summary (62.64 MB)

1 Content Chunking and Metadata Extraction - Overview (96.34 MB)
2 Finding Elements Associated with Chapters - Hands-on (85.82 MB)
3 Semantic Similarity - Hybrid Search and Saving Documents to Vector Database (95.34 MB)
4 Code Restructuring - Avoid Multiple Document Preprocessing (22.81 MB)
5 Semantic Similarity Challenges - Information Recency Criteria (70.46 MB)
6 Chunking for Document Elements and Benefits - Full Overview (146.06 MB)
7 Chunking Document Content - Hands-on (41.69 MB)
8 Summary (18.82 MB)

1 Preprocessing Complex Documents - PDFs and Images - Overview (13.15 MB)
2 Document Image Analysis Methods - Document Layout Detector and Visual Transformer (73.08 MB)
3 Advantages and Disadvantages of ViT and DLD (46.27 MB)
4 Preprocessing HTML and PDF files - Fast (44.38 MB)
5 Preprocessing with Document Layout Detection and Comparing the Results (82.09 MB)
6 Table Content Extraction - Hands-on (68.39 MB)
7 Summarizing the Table Data with LangChain - Hands-on (47.74 MB)

1 Put it All Together - Build a RAG System Using What You've Learned - Overview (18.94 MB)
2 Preprocessing a PDF File and Showing Tabular Content as Well - Part 1 (60.25 MB)
3 Filtering out References and Headers from PDF - Part 2 (60.09 MB)
4 Preprocess PPTX & MD File and Save Document Elements to Vector Database - Part 3 (72.59 MB)
5 Chat with Your Own Documents - PDF - Part 4 (140.22 MB)
6 Chat with Your Own Documents - MD and PPTX Documents - Final (80.15 MB)