Friday, October 11, 2024

PointCloud

In 3D modeling, a point cloud is a set of data points in a 3D coordinate system—commonly known as the XYZ axes.

https://en.wikipedia.org/wiki/Point_cloud
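As a minimal illustration of the definition above, a point cloud can be represented simply as a set of (x, y, z) samples; the sketch below (illustrative names, plain Python) derives two common summaries from one: the centroid and the axis-aligned bounding box.

```python
# A point cloud is just a set of (x, y, z) samples in a 3D coordinate system.
# This sketch stores one as a list of tuples and computes two basic summaries.

def centroid(points):
    """Mean position of the cloud along each of the X, Y, Z axes."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def bounding_box(points):
    """(min_corner, max_corner) of the axis-aligned bounding box."""
    mins = tuple(min(p[i] for p in points) for i in range(3))
    maxs = tuple(max(p[i] for p in points) for i in range(3))
    return mins, maxs

cloud = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0), (2.0, 4.0, 6.0)]
print(centroid(cloud))      # (1.0, 2.0, 3.0)
print(bounding_box(cloud))  # ((0.0, 0.0, 0.0), (2.0, 4.0, 6.0))
```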

Geospatial DBMS

https://postgis.net/

PostGIS extends the capabilities of the PostgreSQL relational database by adding support for storing, indexing, and querying geospatial data.

Retrieval Augmented Generation (RAG) for GAI

 https://www.gai.in.th/prompting-techniques/retrieval-augmented-generation-rag

Wednesday, October 9, 2024

ChatGPT vs Gemini

 From my experience, ChatGPT is good at linguistic tasks but weak at math, while Gemini is the opposite.

Forecast error metrics: sMAPE, MASE, NMAE & NMSE for cross-dataset comparison

  • Fraction terminology: the top of a fraction is the numerator, the bottom is the denominator (Thai: เศษส่วน = fraction, เศษ = numerator, ส่วน = denominator)

Mean Absolute Percentage Error (MAPE)

  • The formula is:

    \text{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \frac{|y_t - \hat{y}_t|}{|y_t|} \times 100

    Where:

    • y_t is the actual value at time t.
    • ŷ_t is the forecasted (predicted) value at time t.
    • n is the total number of observations.
    • MAPE expresses the error as a percentage of the actual values and may exceed 100 percent.
    • Lower MAPE means better forecasting accuracy.
    • Higher MAPE indicates larger errors.
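The formula above translates directly into plain Python; this is a minimal sketch with illustrative names, not a library function.

```python
# Mean Absolute Percentage Error: average of |actual - forecast| / |actual|,
# expressed in percent. Assumes no actual value is zero.

def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent."""
    n = len(actual)
    return 100.0 / n * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))

actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]
print(mape(actual, forecast))  # errors of 10%, 10%, 0% average to about 6.67
```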
Symmetric Mean Absolute Percentage Error (sMAPE) 
  • Another popular metric for evaluating the accuracy of forecasting models. It is a modified version of the Mean Absolute Percentage Error (MAPE) that addresses some of MAPE's limitations, particularly when the actual values are close to zero, which can lead to inflated percentage errors in MAPE.
  • The formula for sMAPE is:

    \text{sMAPE} = \frac{1}{n} \sum_{t=1}^{n} \frac{|y_t - \hat{y}_t|}{(|y_t| + |\hat{y}_t|)/2} \times 100

    Where:

    • y_t is the actual value at time t.
    • ŷ_t is the forecasted value at time t.
    • n is the number of data points.
  • It is considered symmetric because it treats overestimation and underestimation equally:
    • Absolute Difference: It calculates the absolute difference between the forecast and the actual value for each data point. This ensures that both overestimations (positive differences) and underestimations (negative differences) are treated as positive values.
    • Sum of Actual and Forecast: In the denominator, it sums the absolute values of the actual and forecast values. This ensures that the denominator is always positive and symmetric, regardless of whether the forecast is over or underestimating the actual value.
  • IMHO, sMAPE is difficult to interpret because its denominator includes the (possibly erroneous) forecast values as well.
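The symmetry claim above can be checked directly; a minimal sketch in plain Python (illustrative names), with the over/underestimation symmetry visible in the test case:

```python
# Symmetric MAPE: the denominator averages |actual| and |forecast|, so
# swapping actual and forecast leaves the score unchanged.

def smape(actual, forecast):
    """Symmetric Mean Absolute Percentage Error, in percent."""
    n = len(actual)
    return 100.0 / n * sum(
        abs(a - f) / ((abs(a) + abs(f)) / 2) for a, f in zip(actual, forecast)
    )

# Overestimating by 10 scores the same as underestimating by 10:
print(smape([100.0], [110.0]))
print(smape([110.0], [100.0]))
```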
Mean Absolute Scaled Error (MASE)
  • MASE is calculated as the ratio of the mean absolute error of the forecasting model to the mean absolute error of a naive forecasting model. A "naive" model, in this context, refers to a model that simply uses the previous value as the forecast value for the next time period (e.g., for time series data), i.e., the actual value at t−1 is used as the forecast for t.

    The formula for MASE is:

    \text{MASE} = \frac{\frac{1}{n} \sum_{t=1}^{n} |y_t - \hat{y}_t|}{\frac{1}{n-1} \sum_{t=2}^{n} |y_t - y_{t-1}|}

    Where:

    • y_t is the actual observed value at time t
    • ŷ_t is the forecasted value at time t
    • n is the number of observations
    • The denominator is the mean absolute error of a naive forecasting algorithm (often the "random walk" method, which uses the previous period's value as the forecast).

    Interpretation:

    • MASE = 1: The model performs as well as the naive forecast.
    • MASE < 1: The model outperforms the naive forecast (i.e., it produces lower errors).
    • MASE > 1: The model underperforms compared to the naive forecast.
  • Scale independence: MASE can be used for comparison purposes across different datasets with varying scales, since MASE itself has no unit.
  • IMHO, MASE aims at comparing the proposed forecasting algorithm with the (in-sample one-step) naive algorithm.
  • NMAE and NMSE are also scale independent but aim at quantifying the forecast errors in relation to the actual values. NMAE and NMSE are just MAE and MSE divided by one of the following normalizers:
    • Range: the difference between the maximum and minimum of the actual values (max(y_t) − min(y_t)).
    • Mean: the mean of the actual values (ȳ).
    • Median: sometimes the median of the actual values is used.
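The MASE formula above can be sketched in plain Python (illustrative names; the denominator is the in-sample one-step naive forecast, as described):

```python
# MASE: model MAE divided by the MAE of the naive "use the previous value"
# forecast. MASE < 1 means the model beats the naive baseline.

def mase(actual, forecast):
    """Mean Absolute Scaled Error."""
    n = len(actual)
    model_mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / n
    naive_mae = sum(abs(actual[t] - actual[t - 1]) for t in range(1, n)) / (n - 1)
    return model_mae / naive_mae

actual = [10.0, 12.0, 14.0, 16.0]     # naive one-step errors are all 2.0
forecast = [11.0, 11.0, 15.0, 15.0]   # model errors are all 1.0
print(mase(actual, forecast))  # 0.5 -> twice as accurate as the naive forecast
```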
Cf. Gemini, ChatGPT, Wikipedia

Saturday, October 5, 2024

Java's key advantages over Python

 1. Type Safety & Reliability

- Static typing catches errors at compile time rather than at run time, as happens in Python

- Stronger type checking

- Fewer runtime surprises

- Better for large codebases

- Clearer code contracts

2. Performance

- Faster execution speed

- True multithreading (no GIL)

- Better memory management

- More efficient for CPU-intensive tasks

- Better scalability

3. Enterprise Features

- Robust frameworks (Spring)

- Better security features

- Strong enterprise tooling

- Better transaction management

- Industry-standard enterprise solutions

4. Team Development

- Easier to maintain in large teams

- Better code organization

- Clearer package structure

- Enhanced refactoring capabilities

- Better for modular development

5. Tools & IDE Support

- Superior debugging tools

- Better code analysis

- More sophisticated IDEs

- Better profiling tools

- Comprehensive development environments

6. Stability & Maintenance

- Better backward compatibility

- Longer support cycles

- More stable ecosystem

- Clearer deprecation paths

- Better for long-term projects

7. Security

- Built-in security manager

- Better access control

- Enterprise-grade security

- More secure by design

- Better authentication/authorization

8. Documentation & Standards

- More standardized practices

- Better documentation

- Clearer coding standards

- Industry-wide best practices

- More consistent coding patterns

This makes Java particularly suitable for:

- Enterprise applications

- Large-scale systems

- Mission-critical software

- Long-term projects

- Team-based development

While Python excels in:

- Rapid prototyping

- Data science

- Script automation

- Small to medium projects

- Quick development cycles


Friday, October 4, 2024

Python is not for enterprise applications.

It lacks the extensibility and maintainability of Java.

Monday, September 30, 2024

Sunday, September 29, 2024

CustomGPT vs ChatGPT

 CustomGPT utilizes ChatGPT in two primary ways:

  1. Foundation Model: ChatGPT serves as the foundation for CustomGPT models. This means that the underlying architecture and knowledge base of ChatGPT are used as a starting point for training CustomGPT models.
  2. Fine-Tuning: CustomGPT models are further trained on specific datasets relevant to their intended use cases. This process, known as fine-tuning, allows the models to learn and adapt to the unique characteristics of the target domain.

By combining ChatGPT's general-purpose capabilities with fine-tuning on specific datasets, CustomGPT can create highly specialized AI models that are tailored to perform well in various applications.

Friday, September 27, 2024

Neural Net's Learning rate adjustment

 The one-cycle policy (OCP) increases and then decreases the learning rate between a lower and an upper bound over the course of training (i.e., an adaptive learning rate), to avoid slow training as much as possible.

https://medium.com/@varunsivamani/one-cycle-policy-a-deep-understanding-6d4d352ec7b1

https://medium.com/dsnet/the-1-cycle-policy-an-experiment-that-vanished-the-struggle-in-training-neural-nets-184417de23b9
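The schedule described above can be sketched as a simple function of the training step. Note this is a simplified triangular version between the two bounds; actual one-cycle implementations (e.g., fastai) typically add cosine annealing, momentum cycling, and a final low-LR phase.

```python
# Simplified one-cycle learning-rate schedule: linear warm-up from lr_min to
# lr_max over the first half of training, then linear decay back to lr_min.

def one_cycle_lr(step, total_steps, lr_min, lr_max):
    """Learning rate at `step` under a triangular one-cycle-style schedule."""
    half = total_steps / 2
    if step <= half:
        return lr_min + (lr_max - lr_min) * (step / half)
    return lr_max - (lr_max - lr_min) * ((step - half) / half)

# Peak in the middle of training, bounds at the ends:
for s in (0, 25, 50, 75, 100):
    print(s, one_cycle_lr(s, 100, 0.001, 0.01))
```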

Thursday, September 26, 2024

SLA Tiers

Level 1 – 7/24/4 

  • 7 days a week, 24 hours a day, reaction within 4 hours 
  • 7×24 Help Desk 
  • 7×24 Monitoring 
  • Initial Phone-Based Diagnosis 
  • Field Engineer on-site (if required) within 4 hours after customer contact 
  • On-Site Spares Kit 
  • Parts Replacement 
  • Certified Experts 

Level 2 – 5/9/4 

  • 9 hours a day (8am–5pm), 5 working days per week, reaction within 4 hours 
  • 8am-5pm Help Desk 
  • 8am-5pm Monitoring 
  • Initial Phone-Based Diagnosis 
  • Field Engineer on-site (if required) within 4 hours after customer contact 
  • On-Site Spares Kit 
  • Parts Replacement 
  • Certified Experts 

Level 3 – Next Day 

  • Next business day response/parts 
  • Initial Phone-Based Diagnosis  
  • Field Engineer on-site (if required) the following business day after customer contact 
  • On-Site Spares Kit 
  • Parts Replacement 
  • Certified Experts 

https://www.parkplacetechnologies.com/blog/what-does-24x7x4-support-mean-for-your-business/

Monday, September 23, 2024

Predicting Continuous Values with Transformers

Gemini:

Transformers can be effectively used to predict continuous values. While they were initially designed for natural language processing (NLP) tasks, their ability to capture long-range dependencies and complex patterns has made them versatile for various machine learning applications, including regression problems.

How Transformers Work for Regression

 * Input Encoding: The continuous values are typically encoded into a sequence of numerical representations, often using techniques like:

   * Embedding: For categorical variables

   * Normalization: For numerical variables

 * Positional Encoding: To provide the model with information about the relative position of each element in the sequence, positional encoding is added.

 * Transformer Layers: The input sequence is processed through multiple transformer layers, each consisting of:

   * Self-Attention: Captures relationships between different elements in the sequence.

   * Feed-forward Network: Applies a non-linear transformation to each element.

 * Output Layer: The final layer is a linear layer that maps the output of the transformer layers to a continuous value.

Advantages of Using Transformers for Regression

 * Long-range dependencies: Transformers can capture long-range dependencies, making them suitable for time series forecasting or tasks involving complex relationships between distant elements.

 * Flexibility: They can handle both sequential and non-sequential data, providing flexibility in various applications.

 * State-of-the-art performance: Transformers have achieved state-of-the-art results in many regression tasks, including time series forecasting and medical image analysis.

Key Considerations

 * Data preprocessing: Proper data preprocessing, including normalization and handling missing values, is crucial for optimal performance.

 * Hyperparameter tuning: Experimenting with different hyperparameters, such as the number of layers, attention heads, and hidden dimensions, can help improve results.

 * Computational resources: Training large transformer models can be computationally expensive, requiring significant GPU resources.

In conclusion, Transformers are a powerful tool for predicting continuous values, especially in tasks involving complex relationships and long-range dependencies. By understanding their architecture and key considerations, you can effectively leverage them for your regression problems.

Claude:

Transformers can be used to predict continuous values. This is often referred to as "regression" in machine learning terminology. Here's a brief overview:

1. Output layer modification: Instead of using a softmax layer for classification, you'd use a linear layer to output a single continuous value.

2. Loss function: You'd typically use mean squared error (MSE) or mean absolute error (MAE) as the loss function, rather than cross-entropy used in classification tasks.

3. Architecture: The core Transformer architecture (self-attention, feed-forward layers, etc.) remains largely the same.

4. Applications: This approach has been used successfully in various domains, including:

   - Time series forecasting

   - Financial predictions

   - Scientific modeling (e.g., predicting physical properties)

5. Variants: Some models like Temporal Fusion Transformers (TFT) are specifically designed for time series regression tasks.

Example:

https://medium.com/@Matthew_Frank/stock-price-prediction-using-transformers-2d84341ff213#:~:text=A%20Global%20Average%20Pooling%20layer,values%2C%20such%20as%20stock%20prices.
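The pipeline both answers describe (self-attention over a sequence, pooling, then a linear output layer instead of softmax) can be illustrated with a pure-Python toy. This is not a trained Transformer: the Q/K/V projections are identity, the output weights are fixed toy numbers, and all names are illustrative; a real model would learn these weights (e.g., in PyTorch).

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def self_attention(seq):
    """Scaled dot-product self-attention (identity Q/K/V for simplicity)."""
    d = len(seq[0])
    out = []
    for q in seq:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in seq])
        out.append([sum(w * v[i] for w, v in zip(weights, seq)) for i in range(d)])
    return out

def predict(seq, w, b):
    """Mean-pool the attention outputs, then apply a linear regression head."""
    att = self_attention(seq)
    d = len(seq[0])
    pooled = [sum(row[i] for row in att) / len(att) for i in range(d)]
    return dot(pooled, w) + b  # single continuous value, no softmax

seq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 3 timesteps, 2 features each
print(predict(seq, w=[0.5, -0.25], b=0.1))   # one continuous prediction
```

Training such a head against MSE or MAE loss, as Claude's point 2 notes, is what turns this architecture into a regressor.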

Benchmark & Baseline

Benchmarking involves measuring and comparing the performance of systems, models, or processes against a standard or across different systems. This process helps to evaluate how well a model or system performs in relation to known standards or to other models. In machine learning, the benchmark is usually set by the performance of the baseline or other leading models.

Benchmarking uses baselines: The baseline serves as the initial point of comparison in benchmarking. When evaluating a model or system, the baseline provides the first performance standard. If a model performs better than the baseline, it’s an indicator that the model has some value, and further benchmarking against other models can help assess its true effectiveness.

Baseline establishes expectations: Without a baseline, benchmarking would lack a clear starting point. By defining what "acceptable" or "expected" performance looks like, the baseline enables meaningful benchmarking comparisons.

Normalized RMSE & Normalized MAE

To compare model performance across different datasets by scaling the error metrics, you can normalize both RMSE and MAE:

Normalized RMSE:

  • using the mean of the data set: NRMSE = RMSE / ȳ (i.e., the Coefficient of Variation of RMSE),

  • using the range of the data set: NRMSE = RMSE / (y_max − y_min),

  • using the standard deviation of the data set: NRMSE = RMSE / σ, or

  • using the interquartile range of the data set: NRMSE = RMSE / (Q3 − Q1), i.e., the difference between the 75th and 25th percentiles of the observations.


https://www.marinedatascience.co/blog/2019/01/07/normalizing-the-rmse/

Normalized MAE: 
  • a.k.a. the Coefficient of Variation of MAE; the Coefficient of Variation (CV) is the ratio of the standard deviation to the mean 
  • using the mean of the data set: NMAE = MAE / ȳ
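The normalizers listed in the two sections above can be sketched in plain Python using only the standard library (illustrative names; `statistics.quantiles` uses the default exclusive percentile method):

```python
import math
import statistics

def rmse(actual, forecast):
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def mae(actual, forecast):
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def normalize(err, actual, how="mean"):
    """Scale an error metric by a property of the actual data."""
    if how == "mean":                      # coefficient-of-variation style
        return err / statistics.mean(actual)
    if how == "range":                     # max - min
        return err / (max(actual) - min(actual))
    if how == "std":                       # population standard deviation
        return err / statistics.pstdev(actual)
    if how == "iqr":                       # 75th minus 25th percentile
        q1, _, q3 = statistics.quantiles(actual, n=4)
        return err / (q3 - q1)
    raise ValueError(how)

actual = [10.0, 20.0, 30.0, 40.0]
forecast = [12.0, 18.0, 33.0, 41.0]
print(normalize(rmse(actual, forecast), actual, "mean"))   # CV of RMSE
print(normalize(mae(actual, forecast), actual, "range"))   # NMAE by range
```

Because each normalizer gives a different scale, comparisons across datasets are only meaningful when the same normalizer is used throughout.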