Machine Learning Guide
Machine learning audio course, teaching the fundamentals of machine learning and artificial intelligence. It covers intuition, models (shallow and deep), math, languages, frameworks, etc. Where your other ML resources provide the trees, I provide the forest. Consider MLG your syllabus, with highly-curated resources for each episode's details at ocdevel.com. Audio is a great supplement during exercise, commute, chores, etc.
MLA 024 Code AI MCP Servers, ML Engineering
04/13/2025
Links: notes and resources at ocdevel.com. Stay healthy & sharp while you learn & code; audio/video editing with AI power-tools.

Tool Use in AI Code Agents
- File operations: agents can read, edit, and search files using sophisticated regular expressions.
- Executable commands: they can recommend and perform installations like pip or npm installs, with user approval.
- Browser integration: allows agents to perform actions and verify outcomes through browser interactions.

Model Context Protocol (MCP)
- Standardization: MCP was created by Anthropic to standardize how AI tools and agents communicate with each other and with external tools.
- Implementation: the MCP client converts AI-agent requests into structured commands; the MCP server executes commands and sends structured responses back to the client.
- Local and cloud transports: local (stdio) MCPs include Playwright for local browser automation and connections to local databases like Postgres; cloud (SSE) MCPs are hosted by SaaS providers to enhance external integrations. (A minimal stdio sketch follows below.)

Expanding AI Capabilities with MCP Servers
- Directories: various directories list MCP servers for diverse functions beyond programming.
- Use cases: automation beyond coding - extending MCPs into non-programming tasks like sales, marketing, or personal project management - and creative solutions that automate routine tasks by combining diverse MCP functionalities.

AI Tools in Machine Learning
- Automating the ML process: AutoML and feature-engineering tools assist in transforming raw data, optimizing hyperparameters, and inventing new ML solutions; agents also facilitate infrastructure-as-code for deploying ML models efficiently.
- Active experimentation: Jupyter integrations exist but often lag and may not support the latest models; a practical strategy is alternating between Jupyter and traditional Python files to maximize tool efficiency.

Action Plan for ML Engineers
- Set up structured folders and documentation to leverage AI tools effectively.
- Systematically explore MCPs to enhance both direct programming tasks and associated workflows.
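Below is a minimal sketch of the stdio idea behind MCP: a server reading JSON-RPC-style requests on stdin, executing a "tool", and writing structured responses to stdout. The message shape and tool names here are illustrative assumptions, not the official protocol - real servers implement Anthropic's MCP spec or use an official SDK.

```python
# Minimal sketch of the MCP client/server split over the stdio transport.
# Illustrative only -- the real protocol is richer (capabilities, schemas, etc.).
import json
import os
import sys

def list_files(path: str) -> list:
    """A toy 'tool' the server exposes to an AI agent."""
    return os.listdir(path)

TOOLS = {"list_files": list_files}

for line in sys.stdin:
    line = line.strip()
    if not line:
        continue
    req = json.loads(line)  # e.g. {"id": 1, "tool": "list_files", "args": {"path": "."}}
    tool = TOOLS.get(req.get("tool"))
    try:
        result = tool(**req.get("args", {})) if tool else None
        resp = {"id": req["id"], "result": result}
    except Exception as e:  # report tool errors back to the client
        resp = {"id": req["id"], "error": str(e)}
    sys.stdout.write(json.dumps(resp) + "\n")
    sys.stdout.flush()
```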
/episode/index/show/machinelearningguide/id/36113315
MLA 023 Code AI Models & Modes
04/13/2025
Links: notes and resources at ocdevel.com. Stay healthy & sharp while you learn & code; audio/video editing with AI power-tools.

Model Current Leaders
According to the leaderboard linked in the notes (as of April 12, 2025), leading models for vibe-coding include:
- Gemini 2.5 Pro Preview 03-25: the most accurate and cost-effective option currently.
- Claude 3.7 Sonnet: performs well in both architect and code modes with reasoning flags enabled.
- DeepSeek R1 with Claude 3.5 Sonnet: a popular combination for its balance of cost and performance between reasoning and non-reasoning tasks.

Local Models
- Tooling: the tool linked in the notes is the standard for managing local models, enabling usage without internet connectivity. (A sketch of querying a local model follows below.)
- Privacy and security: local models keep data onsite, suitable for sensitive projects or corporate environments that require data to remain in-house.
- Performance trade-offs: due to distillation and size constraints, local models often perform slightly worse than cloud-hosted models, but offer privacy benefits.

Fine-Tuning Models
- Customization: developers can fine-tune pre-trained models to specialize them for their specific codebase, improving relevance and accuracy.
- Advanced usage: for long-term projects, fine-tuning helps models understand unique aspects of a project, yielding consistent code-quality improvements.

Tips and Best Practices
- Judicious use of the @ key: improves model efficiency by specifying the context of commands - file paths, URLs, or git commits - reducing the need for AI-initiated searches.
- Concurrent feature implementation: use orchestration tooling to manage multiple features simultaneously, acting more as a manager overseeing several tasks at once.
- Continued learning: stay updated with your tool's documentation, given the comprehensive feature sets and versatility among AI coding tools.
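As an illustration of local-model usage, here's a sketch that queries a locally hosted model over HTTP the way Ollama (the best-known local-model manager, assumed here) exposes it by default; the model name is a placeholder for whatever you've pulled.

```python
# Query a local model over Ollama's default HTTP endpoint (sketch; assumes
# Ollama is running and the named model has been pulled locally).
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "deepseek-r1",   # placeholder: any locally pulled model
        "prompt": "Write a Python function that reverses a string.",
        "stream": False,          # one JSON object instead of a token stream
    },
    timeout=120,
)
print(resp.json()["response"])
```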
/episode/index/show/machinelearningguide/id/36113275
MLA 022 Code AI Tools
02/09/2025
Links: notes and resources at ocdevel.com. Stay healthy & sharp while you learn & code; audio/video editing with AI power-tools.

I currently favor Roo Code, plus either gemini-2.5-pro-exp-03-25 for Architect, Boomerang, or Code with large contexts, and Claude 3.7 for code with small contexts (e.g., Boomerang subtasks). Many others favor Cursor, Aider, or Cline. Copilot and Windsurf are less in vogue lately; I found Copilot to struggle more, and its pricing - previously its winning point - is less compelling now.

Why I favor Roo: the default settings make it as stable and effective as Cline and Cursor, but you can tinker more with those settings - e.g., for Gemini 2.5 I disable partial file reads (since it has a huge context window). Its modes are elegantly just custom system prompts (an oversimplification), making custom workflows very powerful. A potent example is Boomerang Mode, an orchestrator that delegates planning and edit subtasks to keep context windows tight; Boomerang Mode specifically is a selling point - it's incredibly powerful. Aider is still a darn decent exacto-knife, but as Roo has grown, I haven't found much need for it.

"Vibe coding" is using AI agents in software development: LLMs for code generation and project management. Developers increasingly rely on agentic tools and IDE plugins to improve productivity. These tools generate and edit code - typically integrated within IDEs or as plugins - offering inline editing, bug fixing, and project scaffolding, and the approach is gaining popularity for its efficiency and competitive edge.

Popular AI Tools for Vibe Coding
- Cursor: the most popular and stable, with advanced agentic capabilities. Pricing: $20/month, with additional charges for power-use. Strengths: reliable; integrates new models effectively.
- Windsurf: cost-effective VS Code fork. Pricing: starts at $15, up to $60 for higher usage. Strengths: similar to Cursor, with a competitive pricing model.
- GitHub Copilot: operates within GitHub Codespaces, developed by Microsoft. Pricing: $10-$40/month. Strengths: deep integration with cloud-based development environments.
- Cline: open-source, known for customizable features. Pricing: BYOM (bring your own model), with costs based on individual API usage. Strengths: community-driven, rapid development cycles.
- Roo Code: fast-moving, offering the latest advancements. Pricing: BYOM, similar to Cline. Strengths: frequent updates, for users wanting cutting-edge features.
- Aider: CLI-based, focused on precision and minimal token usage. Pricing: BYOM, with efficient token-usage strategies. Strengths: high accuracy for small adjustments; good as a backup tool.

Choosing the Right Tool
- Beginners: start with Cursor for reliability.
- Experimentation: try Copilot and Windsurf for comparison.
- Advanced configuration: use Cline or Roo Code for sophisticated tasks, and Aider for precise adjustments.

Cost Management
- OpenRouter: centralize API billing to manage interactions across multiple models, preventing fragmented payments. (A sketch follows below.)
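For the OpenRouter point, a sketch using the OpenAI-compatible client against OpenRouter's base URL; the model ID and environment variable are assumptions - check OpenRouter's docs for current identifiers.

```python
# Centralized billing via OpenRouter: one API key, many models, through the
# OpenAI-compatible client (sketch; model ID and env var are placeholders).
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

chat = client.chat.completions.create(
    model="anthropic/claude-3.7-sonnet",  # swap models without changing billing
    messages=[{"role": "user", "content": "Refactor this loop into a comprehension: ..."}],
)
print(chat.choices[0].message.content)
```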
/episode/index/show/machinelearningguide/id/35212505
MLG 033 Transformers
02/09/2025
Links: notes and resources at ocdevel.com; 3Blue1Brown videos linked in the notes. Stay healthy & sharp while you learn & code; audio/video editing with AI power-tools.

Background & Motivation
- RNN limitations: sequential processing prevents full parallelization - even with attention tweaks - making them inefficient on modern hardware.
- Breakthrough: "Attention Is All You Need" replaced recurrence with self-attention, unlocking massive parallelism and scalability.

Core Architecture
- Layer stack: alternating self-attention and feed-forward (MLP) layers, each wrapped in residual connections and layer normalization.
- Positional encodings: since self-attention is permutation-invariant, sinusoidal or learned positional embeddings are added to inject sequence order.

Self-Attention Mechanism
- Q, K, V explained: the Query (Q) is the representation of the token seeking contextual info; the Key (K) is the representation of tokens being compared against; the Value (V) is the information aggregated according to the attention scores.
- Multi-head attention: splits Q, K, V into multiple "heads" to capture diverse relationships and nuances across different subspaces.
- Dot-product & scaling: computes similarity between Q and K (scaled to avoid large gradients), then applies softmax to weigh V accordingly. (See the NumPy sketch below.)

Masking
- Causal masking: in autoregressive models, prevents a token from "seeing" future tokens, ensuring proper generation.
- Padding masks: ignore padded (non-informative) parts of sequences to maintain meaningful attention distributions.

Feed-Forward Networks (MLPs)
- Transformation & storage: post-attention MLPs apply non-linear transformations; many argue they're where the "facts" or learned knowledge really get stored.
- Depth & expressivity: their layered nature deepens the model's capacity to represent complex patterns.

Residual Connections & Normalization
- Residual links: crucial for gradient flow in deep architectures, preventing vanishing/exploding gradients.
- Layer normalization: stabilizes training by normalizing across features, enhancing convergence.

Scalability & Efficiency Considerations
- Parallelization advantage: the entire architecture is designed to exploit modern parallel hardware, a huge win over RNNs.
- Complexity trade-offs: self-attention's quadratic complexity with sequence length remains a challenge, spurring innovations like sparse or linearized attention.

Training Paradigms & Emergent Properties
- Pretraining & fine-tuning: massive self-supervised pretraining on diverse data, followed by task-specific fine-tuning, is the norm.
- Emergent behavior: with scale come abilities like in-context learning and few-shot adaptation, aspects still being unpacked.

Interpretability & Knowledge Distribution
- Distributed representation: "facts" aren't stored in a single layer but are embedded throughout both attention heads and MLP layers.
- Debate on attention: while some see attention weights as interpretable, a growing view is that real "knowledge" is diffused across the network's parameters.
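To make the Q/K/V mechanics concrete, here is a single-head, NumPy-only sketch of scaled dot-product attention with an optional causal mask. Real transformers add learned Q/K/V projections, multiple heads, and batching; this strips it to the formula.

```python
# Scaled dot-product attention with an optional causal mask (didactic sketch).
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V, causal=False):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)          # similarity of each query to each key
    if causal:                               # block attention to future positions
        n = scores.shape[0]
        scores = np.where(np.tril(np.ones((n, n), bool)), scores, -np.inf)
    return softmax(scores) @ V               # weighted sum of values

seq_len, d_model = 5, 8
rng = np.random.default_rng(0)
x = rng.normal(size=(seq_len, d_model))      # token embeddings (+ positions, in a real model)
out = attention(x, x, x, causal=True)        # self-attention: Q = K = V = x
print(out.shape)                             # (5, 8)
```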
/episode/index/show/machinelearningguide/id/35206875
MLA 021 Databricks
06/22/2022
Stay healthy while you study or work! Full notes at ocdevel.com.

Raybeam and Databricks: Ming Chang from Raybeam discusses Raybeam's focus on data science and analytics, and how its recent acquisition by Dept Agency has expanded its scope into MLOps and AI. Raybeam often utilizes Databricks due to its comprehensive nature.

- Understanding Databricks: contrary to initial assumptions, Databricks is not just an analytics platform like Tableau but an MLOps platform competing with tools like SageMaker and Kubeflow. It offers notebooks, Python execution, a hosted Spark cluster, and Delta Lake for data storage.
- Choosing the right MLOps tool: depending on client requirements, Raybeam might recommend different tools. Decision factors include the client's existing expertise, infrastructure needs, and scaling challenges. Databricks is often recommended for its ease of use and features.
- Databricks features: a hosted solution for Spark clusters on AWS, Azure, or GCP; IDE integration (e.g., VS Code) through Databricks Connect; a unique Git integration for version control of notebooks; and Delta Lake for version control of Parquet files, enabling operations like edit and delete.
- Parquet and Delta Lake: Parquet files are optimized for big data, and Delta Lake provides transaction-like operations over Parquet by maintaining version history. (See the sketch below.)
- Pricing and usage: Databricks adds a nominal fee on top of cloud-provider charges. It's accessible for single developers and startups, making it suitable for operations at various scales.
- Ming Chang's picks: automated stock-trading projects and building drones with Raspberry Pi, highlighting the intersection of programming and physical computing. For a hands-on look at the drone project, follow his developments or connect for insights on building a Raspberry Pi-powered drone.
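A sketch of Delta Lake's versioning over Parquet: write a table, then read an earlier snapshot back via "time travel". On Databricks this works out of the box; locally you'd first configure the delta-spark package. The path is a placeholder.

```python
# Delta Lake in a nutshell: versioned writes over Parquet, with time travel
# (sketch; assumes a Spark session configured with Delta Lake support).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("delta-demo").getOrCreate()

df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "val"])
df.write.format("delta").mode("overwrite").save("/tmp/delta/events")

# Edits/deletes become new versions rather than destructive rewrites,
# so earlier snapshots stay queryable:
v0 = spark.read.format("delta").option("versionAsOf", 0).load("/tmp/delta/events")
v0.show()
```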
/episode/index/show/machinelearningguide/id/23502782
MLA 020 Kubeflow
01/29/2022
Stay healthy while you study or work! Full notes at ocdevel.com.

Conversation with Dirk-Jan, Data Scientist at Dept Agency, about Kubeflow (vs cloud-native solutions like SageMaker).

From the Kubeflow website: "The Machine Learning Toolkit for Kubernetes. The Kubeflow project is dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable. Our goal is not to recreate other services, but to provide a straightforward way to deploy best-of-breed open-source systems for ML to diverse infrastructures. Anywhere you are running Kubernetes, you should be able to run Kubeflow."

If using TensorFlow with Kubeflow, combine with TFX for maximum power. From the TFX website: "TensorFlow Extended (TFX) is an end-to-end platform for deploying production ML pipelines. When you're ready to move your models from research to production, use TFX to create and manage a production pipeline."

Alternatives are listed in the notes.
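To show the Kubeflow Pipelines flavor, a toy KFP (v2-style) pipeline sketch: each component runs as a pod on Kubernetes, and compiling emits a YAML spec you'd upload to a Kubeflow cluster. The component body and file name are placeholders.

```python
# Toy Kubeflow Pipelines (KFP v2) pipeline (sketch; requires the kfp package).
from kfp import dsl, compiler

@dsl.component
def train(lr: float) -> float:
    # placeholder "training" step; real components run in containers
    return 1.0 - lr

@dsl.pipeline(name="demo-pipeline")
def demo(lr: float = 0.01):
    train(lr=lr)

# Emits a spec you can upload to a Kubeflow cluster:
compiler.Compiler().compile(demo, "demo_pipeline.yaml")
```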
/episode/index/show/machinelearningguide/id/21939530
MLA 019 DevOps
01/13/2022
Stay healthy while you study or work! Full notes at ocdevel.com.

Chatting with co-workers about the role of DevOps in a machine learning engineer's life. Expert coworkers at Dept: a Principal Software Developer and a DevOps Lead (where Matt features often). The notes include DevOps tools and pictures (funny and serious).
/episode/index/show/machinelearningguide/id/21770120
MLA 017 AWS Local Development
11/06/2021
Stay healthy while you study or work! Show notes at ocdevel.com.

Developing on AWS first (SageMaker or other): consider developing against AWS as your local development environment, rather than only as your cloud deployment environment. Solutions: stick to AWS cloud IDEs (links in the notes), or connect to deployed infrastructure via Infrastructure as Code. (A local-to-cloud sketch follows below.)
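As a sketch of the develop-against-AWS idea: the same boto3 calls run identically from your laptop and from deployed code, keeping dev and production aligned. Profile and bucket names are placeholders.

```python
# Developing locally against real AWS resources (sketch; profile and bucket
# names are placeholder assumptions -- credentials come from ~/.aws).
import boto3

session = boto3.Session(profile_name="dev")
s3 = session.client("s3")

s3.upload_file("train.csv", "my-ml-bucket", "datasets/train.csv")
for obj in s3.list_objects_v2(Bucket="my-ml-bucket", Prefix="datasets/")["Contents"]:
    print(obj["Key"], obj["Size"])
```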
/episode/index/show/machinelearningguide/id/21070127
MLA 016 SageMaker 2
11/05/2021
Stay healthy while you study or work! Full notes at ocdevel.com.

Part 2 of deploying your ML models to the cloud with SageMaker (MLOps). MLOps is deploying your ML models to the cloud. See the notes for an overview of tooling (also generally a great ML educational run-down).
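A sketch of calling a deployed SageMaker endpoint with boto3; the endpoint name and payload schema depend entirely on how your model was deployed.

```python
# Invoke a deployed SageMaker endpoint for inference (sketch; endpoint name
# and payload format are placeholder assumptions).
import json
import boto3

runtime = boto3.client("sagemaker-runtime")
resp = runtime.invoke_endpoint(
    EndpointName="my-model-endpoint",
    ContentType="application/json",
    Body=json.dumps({"instances": [[5.1, 3.5, 1.4, 0.2]]}),
)
print(json.loads(resp["Body"].read()))
```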
/episode/index/show/machinelearningguide/id/21059909
MLA 015 SageMaker 1
11/04/2021
Stay healthy while you study or work!

Part 1 of deploying your ML models to the cloud with SageMaker (MLOps). MLOps is deploying your ML models to the cloud. See the notes for an overview of tooling (also generally a great ML educational run-down). I forgot to mention one tool; I'll mention it next time.
/episode/index/show/machinelearningguide/id/21048182
MLA 014 Machine Learning Server
01/18/2021
Stay healthy while you study or work! Full notes at ocdevel.com.

Server-side ML: training and hosting for inference, with a goal towards serverless. AWS SageMaker, Batch, Lambda, EFS, Cortex.dev. (A serverless-inference sketch follows below.)
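A sketch of the serverless-inference shape on Lambda: load the model once at import time (reused across warm invocations), predict per request. The model path and pickle format are placeholder assumptions - e.g., shipped in a Lambda layer or an EFS mount, as the episode discusses.

```python
# AWS Lambda handler for serverless inference (sketch; model location and
# serialization are assumptions).
import json
import pickle

with open("/opt/model.pkl", "rb") as f:   # loaded once per container, not per request
    model = pickle.load(f)

def handler(event, context):
    features = json.loads(event["body"])["features"]
    pred = model.predict([features])[0]
    return {"statusCode": 200, "body": json.dumps({"prediction": float(pred)})}
```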
/episode/index/show/machinelearningguide/id/17581607
MLA 013 Customer Facing Tech Stack
01/03/2021
Stay healthy while you study or work! Full notes at ocdevel.com. Client, server, database, etc.
/episode/index/show/machinelearningguide/id/17400590
MLA 012 Docker
11/09/2020
Stay healthy while you study or work! Full notes at ocdevel.com. Use Docker for environment setup on localhost and cloud deployment, instead of pyenv / Anaconda. I recommend Windows for your desktop.
/episode/index/show/machinelearningguide/id/16726955
MLG 032 Cartesian Similarity Metrics
11/08/2020
Stay healthy while you study or work! Show notes at ocdevel.com.

L1/L2 norm, Manhattan, Euclidean, cosine distances, dot product.

Normed distances: a norm is a function that assigns a strictly positive length to each vector in a vector space.
- Minkowski is the generalized form: d(x, y) = (sum_i |x_i - y_i|^p)^(1/p), where p = 1, 2, ... selects the specific metrics below.
- L1 (p=1): Manhattan / city-block / taxicab. |x2-x1| + |y2-y1|. Grid-like distance (triangle legs). Preferred in high-dimensional space.
- L2 (p=2): Euclidean. sqrt((x2-x1)^2 + (y2-y1)^2), i.e., the square root of the dot product of the difference with itself. Straight-line distance; the minimum distance (the Pythagorean hypotenuse).
- Others: Mahalanobis, Chebyshev (p=inf), etc.

Dot product: a type of inner product. An outer product lies outside the involved planes; an inner product lies inside the planes/axes involved. The dot product is the inner product on a finite-dimensional Euclidean space. Cosine similarity is the normalized dot product. (See the NumPy sketch below.)
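For concreteness, the episode's metrics in a few lines of NumPy:

```python
# L1, L2, Minkowski, dot product, and cosine similarity (sketch).
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

l1 = np.abs(x - y).sum()                    # Manhattan / city-block
l2 = np.sqrt(((x - y) ** 2).sum())          # Euclidean; == np.linalg.norm(x - y)
p = 3
minkowski = (np.abs(x - y) ** p).sum() ** (1 / p)        # generalized form
dot = x @ y                                 # dot product (unnormalized similarity)
cosine = dot / (np.linalg.norm(x) * np.linalg.norm(y))   # 1.0 here: same direction

print(l1, l2, minkowski, dot, cosine)
```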
/episode/index/show/machinelearningguide/id/16722518
MLA 011 Practical Clustering
11/08/2020
Stay healthy while you study or work! Full notes at ocdevel.com. K-means (sklearn vs FAISS), finding n_clusters via inertia/silhouette, agglomerative clustering, DBSCAN/HDBSCAN.
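A sketch of choosing n_clusters via the silhouette score with scikit-learn, on synthetic blobs:

```python
# Pick n_clusters with the silhouette score (sketch on synthetic data).
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=500, centers=4, random_state=42)

for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X)
    print(k, silhouette_score(X, labels))   # highest score suggests best k (here, 4)
```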
/episode/index/show/machinelearningguide/id/16725809
MLA 010 NLP packages: transformers, spaCy, Gensim, NLTK
10/28/2020
Stay healthy while you study or work! Full notes at ocdevel.com. NLTK: swiss army knife. Gensim: LDA topic modeling, n-grams. spaCy: linguistics. transformers: high-level business NLP tasks.
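A quick taste of each library's sweet spot, as a sketch; models and corpora download on first use (e.g., NLTK's punkt tokenizer, spaCy's en_core_web_sm), so treat this as illustrative.

```python
# One call from each library's niche (sketch; first runs trigger downloads).
import nltk                                  # swiss army knife
from gensim.models.phrases import Phrases   # n-grams / topic modeling
import spacy                                 # linguistics
from transformers import pipeline           # high-level business NLP

tokens = nltk.word_tokenize("NLTK tokenizes text.")        # needs nltk.download('punkt')
bigrams = Phrases([tokens])                  # learns collocations from token lists
doc = spacy.load("en_core_web_sm")("Apple bought a startup.")  # model installed separately
print([(e.text, e.label_) for e in doc.ents])              # e.g., ('Apple', 'ORG')
print(pipeline("sentiment-analysis")("This episode was great!"))
```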
/episode/index/show/machinelearningguide/id/16621373
MLA 009 Charting tools
11/06/2018
Stay healthy while you study or work! Full notes at ocdevel.com. matplotlib, Seaborn, Bokeh, D3, Tableau, Power BI, QlikView, Excel.
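A sketch contrasting the two Python libraries from that list: matplotlib's low-level control versus seaborn's statistical defaults layered on top of it.

```python
# Same histogram, two libraries (sketch).
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

data = np.random.default_rng(0).normal(size=500)

plt.hist(data, bins=30)          # matplotlib: low-level control
plt.savefig("hist_mpl.png")
plt.close()

sns.histplot(data, kde=True)     # seaborn: statistical defaults atop matplotlib
plt.savefig("hist_sns.png")
```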
/episode/index/show/machinelearningguide/id/16622930
MLA 008 Exploratory Data Analysis
10/26/2018
Stay healthy while you study or work! Full notes at ocdevel.com. EDA + charting: DataFrame info/describe, imputing strategies, and useful charts like histograms and correlation matrices.
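That EDA loop as a pandas sketch, with a toy DataFrame standing in for your dataset:

```python
# The basic EDA pass: inspect, impute, chart (sketch).
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "age": [25, 32, np.nan, 41, 38],
    "income": [40e3, 55e3, 62e3, np.nan, 58e3],
})

df.info()                                   # dtypes, non-null counts
print(df.describe())                        # summary statistics
print(df.isna().sum())                      # where imputing is needed
df["age"] = df["age"].fillna(df["age"].median())   # one imputing strategy
print(df.corr())                            # correlation matrix
df.hist()                                   # histograms per numeric column
```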
/episode/index/show/machinelearningguide/id/16622954
MLA 007 Jupyter Notebooks
10/16/2018
Stay healthy while you study or work! Full notes at ocdevel.com. Run your code + visualizations in the browser: IPython / Jupyter notebooks.
/episode/index/show/machinelearningguide/id/16622969
MLA 006 Salary
07/19/2018
Stay healthy while you study or work! Full notes at ocdevel.com. Salary based on location, gender, age, tech... from O'Reilly.
/episode/index/show/machinelearningguide/id/16622978
MLA 005 Shapes & Sizes
06/09/2018
Stay healthy while you study or work! Full notes at ocdevel.com. Dimensions, size, and shape of NumPy ndarrays / TensorFlow tensors, and methods for transforming them.
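A sketch of those shape transformations in NumPy:

```python
# Shape-transform basics for ndarrays/tensors (sketch).
import numpy as np

a = np.arange(12)          # shape (12,)  -- 1-D, size 12
m = a.reshape(3, 4)        # shape (3, 4) -- same data, new view
m2 = a.reshape(-1, 4)      # -1 lets numpy infer the 3
t = m.T                    # transpose -> (4, 3)
col = a[:, np.newaxis]     # (12, 1): add an axis, common for model inputs
flat = m.ravel()           # back to (12,)
print(a.shape, m.shape, m2.shape, t.shape, col.shape, flat.shape)
```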
/episode/index/show/machinelearningguide/id/16622984
MLA 003 Storage: HDF, Pickle, Postgres
05/24/2018
Stay healthy while you study or work! Full notes at ocdevel.com. Comparison of different data-storage options when working with your ML models.
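A sketch writing one DataFrame to the three destinations named in the title; HDF needs the 'tables' package, and the Postgres URL is a placeholder assumption.

```python
# Same DataFrame, three storage options (sketch).
import pandas as pd

df = pd.DataFrame({"x": range(1000), "y": range(1000)})

df.to_pickle("data.pkl")                   # quick & Python-only
df.to_hdf("data.h5", key="df", mode="w")   # fast columnar storage for big arrays

# Postgres: queryable, multi-user (requires SQLAlchemy and a running server):
# from sqlalchemy import create_engine
# engine = create_engine("postgresql://user:pass@localhost/mydb")  # placeholder URL
# df.to_sql("mytable", engine, if_exists="replace")
```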
/episode/index/show/machinelearningguide/id/16622999
MLA 002 Numpy & Pandas
05/24/2018
Stay healthy while you study or work! Full notes at ocdevel.com. Some numerical-data nitty-gritty in Python.
/episode/index/show/machinelearningguide/id/16623014
MLA 001 Certificates & Degrees
05/24/2018
Stay healthy while you study or work! Full notes at ocdevel.com. Reboot of the MLG episode, with more confident recommendations.
/episode/index/show/machinelearningguide/id/16623032
MLG 029 Reinforcement Learning Intro
02/05/2018
Stay healthy while you study or work! Introduction to reinforcement learning concepts. See ocdevel.com for notes and resources. (A Q-learning sketch follows below.)
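As a taste of the concepts, a sketch of the tabular Q-learning update - one introductory RL algorithm. State/action sizes and the single transition are toy assumptions; a real agent loops this over episodes in an environment.

```python
# The heart of tabular Q-learning (didactic sketch).
import numpy as np

n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.99, 0.1       # learning rate, discount, exploration

def choose_action(state):
    if np.random.rand() < epsilon:           # explore
        return np.random.randint(n_actions)
    return int(Q[state].argmax())            # exploit

def update(state, action, reward, next_state):
    target = reward + gamma * Q[next_state].max()
    Q[state, action] += alpha * (target - Q[state, action])

update(0, choose_action(0), 1.0, 1)          # one (s, a, r, s') step
print(Q[0])
```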
/episode/index/show/machinelearningguide/id/6226276
MLG 028 Hyperparameters 2
02/04/2018
Notes and resources at ocdevel.com. Stay healthy while you study or work!

More hyperparameters for optimizing neural networks: a focus on regularization, optimizers, feature scaling, and hyperparameter search methods.

Hyperparameter Search Techniques
- Grid search tests all possible permutations of hyperparameters, but is computationally exhaustive and suited to simpler, less time-consuming models.
- Random search selects random combinations of hyperparameters, potentially saving time while possibly missing the optimal solution. (See the sketch below.)
- Bayesian optimization employs machine learning to continuously update and hone in on efficient hyperparameter combinations, avoiding the exhaustive or random nature of grid and random search.

Regularization in Neural Networks
- L1 and L2 regularization penalize certain parameter configurations to prevent model overfitting, often smoothing overfitted parameters.
- Dropout randomly deactivates neurons during training so the model doesn't over-rely on specific neurons, fostering better generalization.

Optimizers
- Optimizers like Adam, which combines momentum and adaptive learning rates, are vital tools for refining the learning process of neural networks.
- Adam, the most sophisticated and commonly used optimizer, improves on simpler techniques like momentum by incorporating more advanced adaptive features.

Initializers
- Weight initialization matters: methods range from uniform random initialization to the more advanced Xavier initialization, preventing neural networks from starting in "stuck" states.

Feature Scaling
- Scaling methods such as standardization and normalization bring feature inputs into small, standardized ranges.
- Batch normalization integrates scaling directly into the network, preventing issues like exploding and vanishing gradients by normalizing layer outputs.
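A random-search sketch with scikit-learn; the estimator and search space are illustrative stand-ins for whatever model you're tuning.

```python
# Random search over a hyperparameter distribution (sketch).
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV

X, y = load_iris(return_X_y=True)
search = RandomizedSearchCV(
    LogisticRegression(max_iter=1000),
    param_distributions={"C": loguniform(1e-3, 1e2)},  # inverse L2 strength
    n_iter=20, cv=5, random_state=0,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```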
/episode/index/show/machinelearningguide/id/6222761
MLG 027 Hyperparameters 1
01/28/2018
Full notes and resources at ocdevel.com. Stay healthy while you study or work!

Hyperparameters are crucial elements in the configuration of machine learning models. Unlike parameters, which are learned by the model during training, hyperparameters are set by humans before the learning process begins. They are the knobs and dials that humans control to influence the training and performance of machine learning models.

Definition and Importance
Hyperparameters differ from parameters like theta in linear and logistic regression, which are learned weights. They are choices made by humans - the type of model, the number of neurons in a layer, the model architecture - and these choices can significantly affect the model's performance, making conscious, informed tuning vital.

Types of Hyperparameters
- Model selection: choosing which model to use is itself a hyperparameter - e.g., deciding between linear regression, logistic regression, naive Bayes, or neural networks.
- Neural-network architecture: the width (number of neurons) and depth (number of layers), and the types of layers - LSTMs, convolutional layers, or dense layers.
- Activation functions: transform linear outputs into non-linear outputs. Popular choices include ReLU, tanh, and sigmoid, with ReLU being the default for most neural-network layers.
- Regularization and optimization: L1/L2 regularization, dropout, and the choice of optimizer (e.g., Adam, Adagrad) all shape the learning process.

Optimization Techniques
Grid search, random search, and Bayesian optimization systematically explore combinations of hyperparameters to find the best configuration for a given task. While computationally expensive, these methods are necessary for achieving optimal model performance.

Challenges and Future Directions
The field strives to simplify the choice of hyperparameters, ideally automating them until they become parameters of the model itself; efforts like Google's AutoML aim to handle hyperparameter tuning automatically. Understanding and optimizing hyperparameters is a cornerstone of machine learning, directly impacting a model's effectiveness and efficiency, and progress continues to fold these choices into model training, reducing dependence on human trial-and-error.

Decision Tree
- Model selection:
  - Unsupervised? K-means clustering => deep learning
  - Linear? Linear regression, logistic regression
  - Simple? Naive Bayes, decision tree (random forest, gradient boosting)
  - Little data? Boosting
  - Lots of data, complex situation? Deep learning
- Network layer architecture:
  - Vision? CNN. Time-series? LSTM. Other? MLP
  - Trading: LSTM => CNN decision
  - Layer-size design (funnel, etc.); face pics
  - From the BTC episode: don't know? Layers = 1, neurons = mean(inputs, output)
- Output activation:
  - Sigmoid = predict probability of output, usually at the output layer
  - Softmax = multi-class
  - Nothing = regression
  - ReLU family (Leaky ReLU, ELU, SELU, ...) = addresses vanishing gradient (gradient is constant), better performance, usually the default
  - Tanh = classification between two classes; mean of 0 is important

(A Keras sketch of these choices follows below.)
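A sketch of these choices in Keras: layer depth and width, ReLU hidden activations, and the task-dependent output activation. Input and class counts are placeholder assumptions.

```python
# The episode's hyperparameter choices as code (sketch).
import tensorflow as tf

n_inputs, n_classes = 10, 3
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(n_inputs,)),
    tf.keras.layers.Dense(64, activation="relu"),   # ReLU: default hidden activation
    tf.keras.layers.Dense(64, activation="relu"),   # depth & width are hyperparameters
    tf.keras.layers.Dense(n_classes, activation="softmax"),  # multi-class output
    # ...or Dense(1, activation="sigmoid") for binary, Dense(1) for regression
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```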
/episode/index/show/machinelearningguide/id/6195814
MLG 026 Project Bitcoin Trader
01/27/2018
Stay healthy while you study or work! Full notes and resources at ocdevel.com.

NOTE: this episode is no longer relevant, and tforce_btc_trader is no longer maintained. The current podcast project is Gnothi.

Episode Overview
- Project: trading crypto. Intuitively highlights decisions: hyperparameters, supervised vs reinforcement, LSTM vs CNN.
- Crypto (vs stock): Bitcoin, Ethereum, Litecoin, Ripple. Many benefits (immutable permanent distributed ledger; security; low fees; international; etc.). For our purposes: popular, volatile, singular - singular like Forex vs stock (instruments).
- Trading basics: day trading, swing trading, investing. Patterns (technical analysis vs fundamentals). OHLCV / candles. Indicators. Exchanges and arbitrage (GDAX, Kraken).
- Good because it highlights lots: LSTM vs CNN, supervised vs reinforcement, obvious net architectures (indicators, time-series, tanh vs ReLU).

Episode Summary
The "Bitcoin Trader" project involves developing a Bitcoin trading bot using machine learning to capitalize on the hot topic of cryptocurrency and its potential profitability. The project serves as a medium to delve into complex machine learning engineering topics, such as hyperparameter selection and reinforcement learning, over subsequent episodes.

Cryptocurrency, specifically Bitcoin, is used for its universal and decentralized nature, akin to a digital, secure, and democratic financial instrument like the US dollar. Bitcoin mining involves running complex calculations to manage the currency's existence, similar to a distributed Federal Reserve system, with transactions recorded on a secure and permanent ledger known as the blockchain.

The flexibility of cryptocurrency trading allows for machine learning applications across unsupervised, supervised, and reinforcement learning paradigms. This project focuses on models such as LSTM recurrent neural networks and convolutional neural networks, highlighting Bitcoin's unique capacity to illustrate machine-learning concept decisions like network architecture.

Trading differs from investing by focusing on profit from price fluctuations rather than a belief in long-term value increase. It involves understanding patterns in price actions to buy low and sell high. Day trading involves daily buying and selling; swing trading spans longer periods.

Trading decisions rely on patterns identified in price graphs, using time-series data. Data representation through candlesticks (OHLCV: open-high-low-close-volume), coupled with indicators like moving averages and RSI, provides multiple input features for machine learning models, enhancing prediction accuracy. (See the indicator sketch below.)

Exchanges like GDAX and Kraken serve as platforms for converting traditional currencies into cryptocurrencies. The efficient-market hypothesis suggests that an instrument's value is fairly priced by the collective analysis of market participants; differences in exchange prices can provide opportunities for arbitrage, further fueling trading strategies.

The project code, currently using deep reinforcement learning via TensorForce, employs convolutional neural networks over LSTM to adapt to the intricacies of Bitcoin trading. The project is available at ocdevel.com for community engagement, with future episodes tackling hyperparameter selection and deep reinforcement learning techniques.
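A sketch computing two of the indicators mentioned - a simple moving average and RSI - from closing prices in pandas; the price series here is synthetic, standing in for exchange OHLCV data.

```python
# Moving average and RSI from closing prices (sketch; synthetic data).
import numpy as np
import pandas as pd

close = pd.Series(np.random.default_rng(0).normal(0, 1, 200).cumsum() + 100)

sma20 = close.rolling(20).mean()              # 20-period simple moving average

delta = close.diff()
gain = delta.clip(lower=0).rolling(14).mean() # average gain over 14 periods
loss = (-delta.clip(upper=0)).rolling(14).mean()
rsi = 100 - 100 / (1 + gain / loss)           # RSI: >70 overbought, <30 oversold

print(sma20.iloc[-1], rsi.iloc[-1])
```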
/episode/index/show/machinelearningguide/id/6194090
MLG 025 Convolutional Neural Networks
10/30/2017
Stay healthy while you study or work! Notes and resources at ocdevel.com.

- Filters and feature maps: filters are small matrices that detect visual features in an input image by applying to local pixel patches, producing a 3D output called a feature map. Each filter is tasked with recognizing a specific pattern (e.g., edges, textures).
- Convolutional layers: the filter is applied across the image to produce a feature map; a convolutional layer is composed of several feature maps, with depth corresponding to the number of filters applied.
- Window and stride: the window is the size of the pixel patch examined by the filter, and the stride determines how far the window moves across the image. Together they compress images by reducing the number of windows examined, effectively downsampling.
- Padding: padding accounts for border pixels that don't fit perfectly within the window size. 'Same' padding adds zero-padding so all pixels are included, while 'valid' padding ignores excess pixels around the borders.
- Max pooling: a downsampling technique that reduces the spatial dimensions of feature maps by taking the maximum value over a defined window, further compressing and reducing computational load.
- Predefined architectures: well-established architectures like LeNet, AlexNet, and ResNet have been fine-tuned through competitions such as the ImageNet Challenge, and can be used directly or adapted for specific tasks in computer vision.

(These pieces appear as Keras layers in the sketch below.)
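The same pieces as Keras layers, as a sketch; the input shape is an assumption (e.g., 28x28 grayscale images).

```python
# Filters, stride, padding, and max pooling as Keras layers (sketch).
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),             # grayscale images
    tf.keras.layers.Conv2D(32, kernel_size=3, strides=1,  # 32 filters -> 32 feature maps
                           padding="same", activation="relu"),
    tf.keras.layers.MaxPooling2D(pool_size=2),            # downsample 28x28 -> 14x14
    tf.keras.layers.Conv2D(64, 3, padding="valid", activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.summary()
```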
/episode/index/show/machinelearningguide/id/5890712
MLG 024 Tech Stack
10/07/2017
Stay healthy while you study or work! Notes and resources at ocdevel.com.

Hardware
- Desktop if you're stationary - best performance bang-for-buck and improved longevity; laptop if you're mobile.
- Desktops: build your own PC for better value than pre-built (see the linked guide), and make sure to use an Nvidia graphics card. Generally shoot for the 2nd-best of CPUs/GPUs - e.g., the RTX 4070 currently (2024-01), a better value-to-price than the 4080+. For laptops, see the notes.

OS / Software
- Use Linux (I prefer Ubuntu), or Windows with WSL2 and Docker. See the notes for details.

Programming Tech Stack
- Deep-learning frameworks: you'll use both TF and PT eventually, so don't get hung up. TensorFlow (and/or Keras); PyTorch (and/or Lightning). See the notes for details.
- Shallow-learning / utilities: scikit-learn, Pandas, NumPy.
- Cloud hosting: AWS / GCP / Azure. See the notes for details.

Episode Summary
The episode discusses setting up a tech stack tailored for machine learning, emphasizing the necessity of choosing a primary programming language and framework - here, Python and TensorFlow. The decision is supported by the ongoing popularity and community support for these tools, and by the need for GPU optimization, which TensorFlow provides through Nvidia's CUDA technology.

A notable change in the landscape is the decline of certain deep-learning frameworks such as Theano, and the rise of competitors like PyTorch, which is gaining traction due to its ease of use compared to TensorFlow. The episode emphasizes selecting frameworks with robust community support and resources, highlighting TensorFlow's market lead in this respect.

For hardware, the suggestion is a custom-built PC with a powerful Nvidia GPU, such as the 1080 Ti, running Ubuntu Linux for best compatibility. For those who favor cloud services, Amazon Web Services (AWS) and Google Cloud Platform (GCP) are viable options, with a preference for GCP due to cost and performance benefits, particularly with the upcoming Tensor Processing Units (TPUs).

On the software side, Pandas for data manipulation, NumPy for mathematical operations, and scikit-learn for shallow-learning tasks form a comprehensive toolkit for ML development. Abstraction libraries such as Keras (simplifying TensorFlow syntax) and TensorForce (reinforcement learning) are also recommended.

The episode further explores system architecture, suggesting a separation of concerns between a web-app server and a machine-learning (job) server. Communication between these components can be managed efficiently with a message-queuing system like RabbitMQ, with Celery as a potential abstraction layer. (See the sketch below.)

To support developers in implementing their ML pipelines, the recommendations extend to leveraging existing datasets via scikit-learn's convenient access, and standardizing data for effective training results. Several books and resources are suggested, ending with recommendations to build your own workstation and to compile TensorFlow from source for performance gains as an advanced optimization step.
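A sketch of the web-server/job-server split with Celery over RabbitMQ; the broker URL, module name, and task body are placeholder assumptions.

```python
# Offload heavy ML work from the web server to a job server via a queue
# (sketch; assumes a RabbitMQ broker running locally).
from celery import Celery

app = Celery("ml_jobs", broker="amqp://guest@localhost//", backend="rpc://")

@app.task
def train_model(dataset_path: str) -> str:
    # heavy ML work runs on the GPU job server, not the web server
    return f"model trained on {dataset_path}"

# From the web app:  train_model.delay("s3://bucket/data.csv")
# Run the worker:    celery -A ml_jobs worker   (module name is a placeholder)
```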
/episode/index/show/machinelearningguide/id/5816352