Recent Notes

Note Created At
Sliding Window Attention November 10, 2025
Multi-Head Latent Attention November 10, 2025
Why do we scale attention weights? November 10, 2025
Pre-Fill in LLM November 10, 2025
When less data is better than more? November 08, 2025
GPU Computation for LLM November 08, 2025
Recent Notes November 08, 2025
Named Entity Recognition (NER) November 08, 2025
Why do we use Projection in QKV? November 08, 2025
DistilBERT November 08, 2025
Why Trigonometric Function for Positional Encoding? November 07, 2025
Instruction Fine Tuning November 07, 2025
Pre-Training LLM November 07, 2025
Auto Regressive Model November 07, 2025
Additive Attention November 07, 2025
Optimizing Transformer November 07, 2025
Multi-Head Attention November 07, 2025
Transformer vs LSTM November 07, 2025
Cross-Attention November 07, 2025
Encoder-Decoder Transformer November 07, 2025
Masked Self-Attention November 07, 2025
Positional Encoding in Transformer November 07, 2025
Linear Regression with Normal Equation November 02, 2025
Multi-Query Attention October 31, 2025
Group-Query Attention October 31, 2025
Paged KV Cache October 31, 2025
KV Cache October 30, 2025
Self-Attention October 30, 2025
Balanced Accuracy October 26, 2025
Fake News Challenge October 25, 2025
Claim Verification Datasets October 25, 2025
Math Dataset October 02, 2025
How to do research September 15, 2025
GPT-OSS September 15, 2025
How to Write Academic Paper (from CS Perspective) September 15, 2025
Presentation Making Tips September 15, 2025
Research Skills Unsorted List September 15, 2025
Toward RL Learning September 15, 2025
vector September 15, 2025
unit-vector September 15, 2025
trace-operator September 15, 2025
spacy-pattern September 15, 2025
spacy-pos September 15, 2025
spacy-pipeline September 15, 2025
spacy-semantic-similarity September 15, 2025
spacy-syntactic-dependency September 15, 2025
spacy-matcher September 15, 2025
spacy-named-entities September 15, 2025
spacy-explanation-of-labels September 15, 2025
spacy-operator-quantifier September 15, 2025
p-value September 15, 2025
spacy-doc-object September 15, 2025
scalar September 15, 2025
spacy-doc-span-token September 15, 2025
norm September 15, 2025
matplotlib legend September 15, 2025
orthogonal-matrix September 15, 2025
orthonormal-vector September 15, 2025
max-norm September 15, 2025
jupyter-notebook-on-server September 15, 2025
logarithm September 15, 2025
matplotlib functions September 15, 2025
lp-norm September 15, 2025
frobenius-norm September 15, 2025
fully-independent-join-distribution September 15, 2025
fully-joint-joint-distribution September 15, 2025
identity-matrix September 15, 2025
joint-distribuition September 15, 2025
determinant September 15, 2025
diagonal-matrix September 15, 2025
doing-literature-review September 15, 2025
bisect_left vs. bisect_right September 15, 2025
conditionally-independent-joint-distribution September 15, 2025
Word Tokenizer September 15, 2025
Word Embeddings September 15, 2025
Word2Vec Embedding September 15, 2025
WordPiece Tokenization September 15, 2025
Variance September 15, 2025
Unsupervised Learning September 15, 2025
Type 1 Error vs. Type 2 Error September 15, 2025
Undersampling September 15, 2025
Undercomplete Autoencoder September 15, 2025
Triplet Loss September 15, 2025
True Positive Rate September 15, 2025
True Negative Rate September 15, 2025
Tokenizer September 15, 2025
Top-K in Retrieval System September 15, 2025
Training a Deep Neural Network September 15, 2025
Three Way Partitioning September 15, 2025
Time Complexity of ML Algos September 15, 2025
Time Complexity of ML Models September 15, 2025
Shallow vs. Deep Learning September 15, 2025
Tanh September 15, 2025
TF-IDF September 15, 2025
Sub-sampling in Word2Vec September 15, 2025
Supervised Learning September 15, 2025
Sub-word Tokenizer September 15, 2025
Support Vector Machine (SVM) September 15, 2025
Support Vector September 15, 2025
Surprise September 15, 2025
Stochastic Gradient Descent (SGD) September 15, 2025
Stochastic Gradient Descent with Momentum September 15, 2025
Stride in CNN September 15, 2025
Stump September 15, 2025
Stop Words September 15, 2025
Standard deviation September 15, 2025
Stacking or Meta Model in Ensemble Learning September 15, 2025
Splitting tree in Decision Tree September 15, 2025
Standardization September 15, 2025
Statistical Significance September 15, 2025
Softsign September 15, 2025
Some Common Behavioral Questions September 15, 2025
Softplus September 15, 2025
Sources of Uncertainty September 15, 2025
Specificity September 15, 2025
Simple Linear Regression September 15, 2025
Skip Gram Model September 15, 2025
Soft Margin in SVM September 15, 2025
Softmax September 15, 2025
Sequence-to-Sequence Model September 15, 2025
Sensitivity September 15, 2025
SentencePiece Tokenization September 15, 2025
SVC September 15, 2025
Saddle Points September 15, 2025
Second Order Derivative or Hessian Matrix September 15, 2025
Semi-supervised Learning September 15, 2025
Self-Supervised Learning September 15, 2025
Reno Talk @UMBC on Scale-2024 September 15, 2025
Root Mean Squared Logarithmic Error (RMSLE) September 15, 2025
Retrieval Metrics September 15, 2025
Root Mean Squared Error (RMSE) September 15, 2025
Recall@K September 15, 2025
Regularization September 15, 2025
Reinforcement Learning from Human Feedback (RLHF) September 15, 2025
Reinforcement Learning September 15, 2025
Recall September 15, 2025
Random Forest September 15, 2025
ReLU September 15, 2025
Random Variable September 15, 2025
RNN September 15, 2025
ROUGE-N Score September 15, 2025
ROUGE-LSUM Score September 15, 2025
ROC Curve September 15, 2025
ROUGE-L Score September 15, 2025
RTE (Recognizing Textual Entailment) September 15, 2025
Quotient Rule or Differentiation of Division September 15, 2025
Quintile or Percentile September 15, 2025
R-squared Value September 15, 2025
RMSProp September 15, 2025
PyTorch Loss Functions September 15, 2025
Questions to ask in an Interview? September 15, 2025
PyTorch Refresher September 15, 2025
Probability Density Function September 15, 2025
Probability Distribution September 15, 2025
Probability Mass Function September 15, 2025
Probability vs. Likelihood September 15, 2025
Problem Solving Algorithm Selection September 15, 2025
Prepare for Talk September 15, 2025
Prior Probability September 15, 2025
Principal Component Analysis (PCA) September 15, 2025
Population September 15, 2025
Posterior Probability September 15, 2025
Precision@K September 15, 2025
Precision September 15, 2025
Precision Recall Curve (PRC) September 15, 2025
Perplexity September 15, 2025
Polynomial Regression September 15, 2025
Pooling September 15, 2025
Parameter vs. Hyperparameter September 15, 2025
Perceptron September 15, 2025
Padding in CNN September 15, 2025
Overcomplete Autoencoder September 15, 2025
Oversampling September 15, 2025
PCA vs. Autoencoder September 15, 2025
One Class Gaussian September 15, 2025
One Hot Vector September 15, 2025
One vs One Multi Class Classification September 15, 2025
One vs Rest or One vs All Multi Class Classification September 15, 2025
One Class Classification September 15, 2025
Normalization September 15, 2025
Odds Ratio September 15, 2025
Odds September 15, 2025
Null Hypothesis September 15, 2025
Neural Network Normalization September 15, 2025
Normal Distribution September 15, 2025
Neural Network September 15, 2025
Nesterov Accelerated Gradient (NAG) September 15, 2025
Naive Bayes September 15, 2025
N-gram Method September 15, 2025
Negative Log Likelihood September 15, 2025
Negative Sampling September 15, 2025
Multivariate Normal Distribution September 15, 2025
Mutual Information September 15, 2025
Multiset September 15, 2025
Multivariable Linear Regression September 15, 2025
Multivariate Linear Regression September 15, 2025
Multi Layer Perceptron September 15, 2025
Multi Label Cross Entropy September 15, 2025
Mode September 15, 2025
Model Based vs. Instance Based Learning September 15, 2025
Multi Class Cross Entropy September 15, 2025
Merge Overlapping Intervals September 15, 2025
Merge K-sorted List September 15, 2025
Mini Batch SGD September 15, 2025
Meteor Score September 15, 2025
Min Max Normalization September 15, 2025
Mean Absolute Percentage Error (MAPE) September 15, 2025
Mean Reciprocal Rank (MRR) September 15, 2025
Mean Squared Error (MSE) September 15, 2025
Mean September 15, 2025
Mean Squared Logarithmic Error (MSLE) September 15, 2025
Median September 15, 2025
Matrices September 15, 2025
Maximal Margin Classifier September 15, 2025
Maximum Likelihood September 15, 2025
Mean Absolute Error (MAE) September 15, 2025
Margin in SVM September 15, 2025
Marginal Probability September 15, 2025
Machine Learning Algorithm Selection September 15, 2025
Machine Learning vs. Deep Learning September 15, 2025
Majority vote in Ensemble Learning September 15, 2025
Logistic Regression vs. Neural Network September 15, 2025
ML Case Study or ML Design September 15, 2025
Logistic Regression September 15, 2025
Local Minima September 15, 2025
Log (Odds Ratio) September 15, 2025
Log (Odds) September 15, 2025
Log Scale September 15, 2025
Log-cosh Loss September 15, 2025
Linear Regression September 15, 2025
Local Attention September 15, 2025
Line Equation September 15, 2025
Likelihood September 15, 2025
Label Encoding September 15, 2025
Leaky ReLU September 15, 2025
Layer Normalization September 15, 2025
LLM GPU Calculate September 15, 2025
L1 vs. L2 Regression September 15, 2025
L2 or Ridge Regression September 15, 2025
LSTM September 15, 2025
K-nearest Neighbor (KNN) September 15, 2025
Kernel Regression September 15, 2025
KL Divergence September 15, 2025
Kernel in SVM September 15, 2025
L1 or Lasso Regression September 15, 2025
Jaccard Distance September 15, 2025
Jaccard Similarity September 15, 2025
K-means vs. Hierarchical September 15, 2025
K Fold Cross Validation September 15, 2025
K-means Clustering September 15, 2025
Intrinsic Evaluation September 15, 2025
Interview Resources September 15, 2025
Independent Variable September 15, 2025
Instructional Websites September 15, 2025
Integration by Parts or Integration of Product September 15, 2025
How to prepare for Behavioral Interview September 15, 2025
Hyperparameters September 15, 2025
Huber Loss September 15, 2025
Homonym or Polysemy September 15, 2025
How to Choose Kernel in SVM September 15, 2025
How to combine in Ensemble Learning September 15, 2025
Heapq (nlargest or nsmallest) September 15, 2025
Hierarchical Clustering September 15, 2025
Histogram September 15, 2025
Gumbel Softmax September 15, 2025
Handling Imbalanced Dataset September 15, 2025
Greedy Decoding September 15, 2025
Grid Search Hyperparameter Finding September 15, 2025
Group Normalization September 15, 2025
Gradient Clipping September 15, 2025
Gradient Descent September 15, 2025
Gradient September 15, 2025
Gradient Boost (Classification) September 15, 2025
Gradient Boost (Regression) September 15, 2025
Gradient Boosting September 15, 2025
Gini Impurity September 15, 2025
GloVe Embedding September 15, 2025
Global Minima September 15, 2025
Gaussian Distribution September 15, 2025
GRU September 15, 2025
Genetic Algorithm Hyperparameter Finding September 15, 2025
Fine Tuning Large Language Models September 15, 2025
Forward Feature Selection September 15, 2025
Foundation Model September 15, 2025
GBM September 15, 2025
Feature Extraction September 15, 2025
Feature Preprocessing September 15, 2025
Feature Hashing September 15, 2025
Finding Correlation between two datasets or distributions September 15, 2025
False Negative Error September 15, 2025
Feature Engineering September 15, 2025
False Positive Rate September 15, 2025
FastText Embedding September 15, 2025
Extrinsic Evaluation September 15, 2025
F-Beta@K September 15, 2025
F-Beta Score September 15, 2025
Exponential Distribution September 15, 2025
F1 Score September 15, 2025
Expected Value September 15, 2025
Expected Value for Discrete Events September 15, 2025
Exploding Gradient September 15, 2025
Euclidean Norm September 15, 2025
Estimated Mean September 15, 2025
Estimated Standard Deviation September 15, 2025
Estimated Variance September 15, 2025
Entropy September 15, 2025
Essential Visualizations September 15, 2025
Elastic Net Regression September 15, 2025
Ensemble Learning September 15, 2025
Entropy and Information Gain September 15, 2025
Encoder Only Transformer September 15, 2025
Dying ReLU September 15, 2025
Dynamic Programming (DP) in python September 15, 2025
Eigendecomposition September 15, 2025
ELMo Embeddings September 15, 2025
Discriminative vs. Generative Models September 15, 2025
Domain vs. Codomain vs. Range September 15, 2025
Dropout September 15, 2025
Discrete Random Variable September 15, 2025
Derivative September 15, 2025
Differentiation of Product September 15, 2025
Differentiation September 15, 2025
Density Sparse Data September 15, 2025
Dependent Variable September 15, 2025
Decision Tree September 15, 2025
Decoder Only Transformer September 15, 2025
Decision Boundary September 15, 2025
Decision Tree (Regression) September 15, 2025
Decision Tree (Classification) September 15, 2025
Data Imputation September 15, 2025
Data Augmentation September 15, 2025
Crossed Feature September 15, 2025
Curse of Dimensionality September 15, 2025
DBSCAN Clustering September 15, 2025
Cosine Similarity September 15, 2025
Cross Entropy September 15, 2025
Cross Validation September 15, 2025
Count based Word Embeddings September 15, 2025
Continuous Random Variable September 15, 2025
Contrastive Learning September 15, 2025
Convex vs Nonconvex Function September 15, 2025
Contrastive Loss September 15, 2025
Conditional Probability September 15, 2025
Confusion Matrix September 15, 2025
Continuous Bag of Words September 15, 2025
Contextualized Word Embeddings September 15, 2025
Co-occurrence based Word Embeddings September 15, 2025
Collinearity September 15, 2025
Chain Rule September 15, 2025
Character Tokenizer September 15, 2025
Co-Variance September 15, 2025
Challenges of NLP (2022) September 15, 2025
Causal Language Modeling September 15, 2025
Causality vs. Correlation September 15, 2025
CNN September 15, 2025
Boosting September 15, 2025
Byte Level BPE September 15, 2025
Byte Pair Encoding (BPE) September 15, 2025
Binary Cross Entropy September 15, 2025
Binning or Bucketing September 15, 2025
Binomial Distribution September 15, 2025
Behavioral Interview September 15, 2025
Bidirectional RNN or LSTM September 15, 2025
Bias & Variance September 15, 2025
Batch Normalization September 15, 2025
Bayes Theorem September 15, 2025
Beam Search September 15, 2025
Bag of Words September 15, 2025
Bagging September 15, 2025
Basics of Kubernetes September 15, 2025
BLEU Score September 15, 2025
Backward Feature Elimination September 15, 2025
BERT Embeddings September 15, 2025
Averaging in Ensemble Learning September 15, 2025
BERT September 15, 2025
Area Under Precision Recall Curve (AUPRC) September 15, 2025
Autoencoder for Denoising Images September 15, 2025
Adjusted R-squared Value September 15, 2025
Amazon Leadership Principles September 15, 2025
Alternative Hypothesis September 15, 2025
Adaboost September 15, 2025
Adam September 15, 2025
AdaGrad September 15, 2025
AdaDelta September 15, 2025
Activation Function September 15, 2025
Accuracy September 15, 2025
AUC Score September 15, 2025
3 key questions in data visualization September 15, 2025
What is More Likely to Happen Next September 15, 2025
Semantic Product Search for Matching Structured Product Catalogs in E-Commerce September 15, 2025
Vision Language Model-based Caption Evaluation Method Leveraging Visual Context Extraction September 15, 2025
Token Assorted - Mixing Latent and Text Tokens for Improved Language Model Reasoning September 15, 2025
Scientific Fact-Checking - A Survey of Resources and Approaches September 15, 2025
PubMedQA - A Dataset for Biomedical Research Question Answering September 15, 2025
OpenPI-C September 15, 2025
Piecing It All Together - Verifying Multi-Hop Multimodal Claims September 15, 2025
MultiVENT September 15, 2025
MM-LLMs September 15, 2025
Molmo and PixMo September 15, 2025
Is a Question Decomposition Unit All We Need September 15, 2025
Large Language Models are Zero-Shot Rankers for Recommender Systems September 15, 2025
Investigating Continual Pretraining in Large Language Models - Insights and Implications September 15, 2025
G-Eval - NLG Evaluation using GPT-4 with Better Human Alignment September 15, 2025
DeepSeek-R1 September 15, 2025
Deliberative Alignment - Reasoning Enables Safer Language Models September 15, 2025
Compressed Chain of Thought - Efficient Reasoning Through Dense Representations September 15, 2025
τ-bench - A Benchmark for Tool-Agent-User Interaction in Real-World Domains September 15, 2025
COIN September 15, 2025
How to Read a Paper September 15, 2025
ML Interview September 15, 2025
How To Write a Paper September 15, 2025
Deep Learning by Ian Goodfellow September 15, 2025
How To 100M Learning Text Video September 15, 2025
Advanced NLP with spaCy September 15, 2025
Home September 15, 2025