Friday, 29 November 2024

Recent text classification algorithms

 Deep Learning-Based Approaches

  • Transformer-Based Models:
    • BERT (Bidirectional Encoder Representations from Transformers)
    • RoBERTa (Robustly Optimized BERT Pretraining Approach) 
    • XLNet
    • GPT-3
    • DistilBERT
  • Recurrent Neural Networks (RNNs):
    • Long Short-Term Memory (LSTM) and Bidirectional LSTM (BiLSTM)
    • Gated Recurrent Unit (GRU)
  • Convolutional Neural Networks (CNNs):
    • CNN
    • TextCNN
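
As a concrete illustration of the transformer-based approach, here is a minimal sketch using the Hugging Face transformers pipeline with a publicly available DistilBERT sentiment checkpoint; the model name and the example sentences are assumptions for illustration, not part of these notes.

```python
# Minimal sketch: transformer-based text classification, assuming the
# "transformers" library is installed and the DistilBERT checkpoint
# "distilbert-base-uncased-finetuned-sst-2-english" can be downloaded.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

texts = [
    "The battery life on this laptop is outstanding.",
    "The package arrived late and the box was damaged.",
]
for text, result in zip(texts, classifier(texts)):
    # Each result is a dict such as {"label": "POSITIVE", "score": 0.99}.
    print(f"{result['label']:>8}  {result['score']:.3f}  {text}")
```

Fine-tuning BERT, RoBERTa, XLNet, or DistilBERT on a task-specific labelled dataset follows a similar pattern using the library's Trainer API.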
 Traditional Machine Learning Approaches

    1. Naïve Bayes (NB): Probabilistic; effective for high-dimensional text.
    2. Support Vector Machines (SVM): Strong for sparse data; uses margins to separate classes.
    3. Logistic Regression: Simple and interpretable for binary/multi-class tasks.
    4. k-Nearest Neighbors (k-NN): Uses proximity; expensive for large datasets.
    5. Random Forests: Ensemble-based; reduces overfitting.
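
A minimal sketch of the traditional pipeline (TF-IDF features plus two of the classifiers above), assuming scikit-learn; the tiny training set is purely illustrative.

```python
# Minimal sketch: TF-IDF features + Naive Bayes / linear SVM, assuming
# scikit-learn is installed. The toy training data is hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

train_texts = [
    "great product, works perfectly",
    "terrible quality, broke after a day",
    "excellent value and fast shipping",
    "awful support, very disappointed",
]
train_labels = ["pos", "neg", "pos", "neg"]

for clf in (MultinomialNB(), LinearSVC()):
    model = make_pipeline(TfidfVectorizer(), clf)
    model.fit(train_texts, train_labels)
    print(type(clf).__name__, model.predict(["fast shipping but awful quality"]))
```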

Thursday, 28 November 2024

Web Cache Communication Protocol (WCCP)

Use WCCP to redirect traffic from routers or Layer 4 switches to multiple Squid servers, improving scalability.

Deploy multiple Squid servers and distribute traffic across them using a load balancer like HAProxy, NGINX, or hardware appliances.
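
For the load-balancer approach, a minimal HAProxy configuration sketch might look like the following; the listening port 3128 and the two backend addresses are placeholder assumptions, not values from these notes.

```
# Minimal HAProxy sketch: distribute client proxy traffic across two
# Squid servers in TCP mode. Port and addresses are examples only.
frontend squid_frontend
    bind *:3128
    mode tcp
    default_backend squid_pool

backend squid_pool
    mode tcp
    balance roundrobin
    server squid1 10.0.0.11:3128 check
    server squid2 10.0.0.12:3128 check
```

WCCP-based redirection, by contrast, is configured on the router or Layer 4 switch and via Squid's wccp2_* directives rather than through a separate load balancer.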

Sniffer and performance tools

 https://www.linuxlinks.com/best-free-open-source-network-analyzers/

Ttcp (https://netref.soe.ucsc.edu/node/31)

Iperf (https://iperf.fr/)

Tuesday, 19 November 2024

Information Gain

Information Gain measures how much information entropy is reduced after splitting the dataset on an attribute. It helps identify the attribute that provides the most information about the target class.



IG(D, A) = H(D) - \sum_{v \in \mathrm{Values}(A)} \frac{|D_v|}{|D|} H(D_v)

where H is the information entropy, D is the dataset, and D_v is the subset of D in which attribute A takes the value v.

Key Idea: The larger the reduction in entropy after the split, the greater the Information Gain. Attributes with higher Information Gain are preferred for splitting.



Applications in Decision Trees: At each node, the algorithm selects the attribute with the highest Information Gain to split the dataset.
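
A small sketch of the computation, with a hypothetical weather-style attribute and labels, just to show the entropy and gain formulas in code.

```python
# Minimal sketch: entropy H(D) and information gain IG(D, A) for one
# categorical attribute. The tiny outlook/play dataset is hypothetical.
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H(D) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(attribute_values, labels):
    """IG(D, A) = H(D) - sum over values v of |D_v|/|D| * H(D_v)."""
    total = len(labels)
    subsets = {}
    for value, label in zip(attribute_values, labels):
        subsets.setdefault(value, []).append(label)
    remainder = sum((len(sub) / total) * entropy(sub)
                    for sub in subsets.values())
    return entropy(labels) - remainder

outlook = ["sunny", "sunny", "overcast", "rain", "rain", "overcast"]
play    = ["no",    "no",    "yes",      "yes",  "no",   "yes"]
print(f"IG(play | outlook) = {information_gain(outlook, play):.3f}")
```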


Thursday, 14 November 2024

Types of regression models

There are many types of regression models, each suited to different kinds of data and relationships. Some common types are listed below, followed by a short code sketch comparing a few of them:

1. Linear Regression: Models the relationship between a dependent variable and one or more predictors by fitting a straight line (a linear function).

2. Logistic Regression: Used for binary classification, predicting probabilities for categories (e.g., yes/no, 0/1).

3. Polynomial Regression: Extends linear regression to model nonlinear relationships by using polynomial functions.

4. Ridge Regression: A type of linear regression that includes a regularization term to prevent overfitting.

5. Lasso Regression: Similar to ridge regression, but it can reduce some coefficients to zero, effectively selecting features.

6. Elastic Net Regression: Combines ridge and lasso regression for a balance between feature selection and regularization.

7. Quantile Regression: Estimates the median or other quantiles of the response variable rather than its mean.

8. Poisson Regression: Used for count data, modeling how often an event happens.

9. Ordinal Regression: Models ordinal (ranked) outcomes, where categories have an order but no specific distance between them.

10. Multinomial Logistic Regression: Extends logistic regression for multiclass classification problems.

11. Bayesian Regression: Applies Bayesian principles to linear regression for probabilistic prediction.

12. Support Vector Regression (SVR): A type of regression that uses support vector machine concepts for both linear and nonlinear relationships.
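
As referenced above, here is a minimal sketch comparing a few of these model families on synthetic data; scikit-learn and the particular regularization strengths are assumptions chosen only for illustration.

```python
# Minimal sketch: fit several regression types on the same synthetic
# dataset and compare test R^2. Assumes scikit-learn; alpha values and
# the SVR settings are arbitrary illustrative choices.
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Ridge, Lasso, ElasticNet
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

X, y = make_regression(n_samples=300, n_features=10, noise=15.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "Linear":      LinearRegression(),
    "Ridge (L2)":  Ridge(alpha=1.0),
    "Lasso (L1)":  Lasso(alpha=0.5),
    "Elastic Net": ElasticNet(alpha=0.5, l1_ratio=0.5),
    "SVR (RBF)":   SVR(kernel="rbf", C=100.0),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name:<12} test R^2 = {model.score(X_test, y_test):.3f}")
```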