Monday, March 25, 2024

Wednesday, March 13, 2024

Problems encountered while using well-known LLM services

ChatGPT and Gemini generated incorrect Python code but insisted it was correct, so humans are still needed to detect hallucinations.

They face legal cases over copyrighted content used in model training, e.g. Harry Potter books and newspapers.

They are actually not just large language models (LLMs) but ML systems more broadly, since they can also do clustering and prediction, for example.

Tuesday, March 12, 2024

Create a correlation matrix using Python

 https://www.geeksforgeeks.org/create-a-correlation-matrix-using-python/
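
A minimal sketch of what the linked tutorial covers, assuming pandas; the column names and values below are made up for illustration:

# Build a correlation matrix with pandas; the data here is invented.
import pandas as pd

data = {
    "hours_studied": [2, 4, 6, 8, 10],
    "exam_score":    [55, 60, 70, 85, 95],
    "sleep_hours":   [9, 8, 7, 6, 5],
}
df = pd.DataFrame(data)

# corr() computes pairwise Pearson correlations between numeric columns.
corr_matrix = df.corr()
print(corr_matrix)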

Hyperparameter vs. model parameter

 Model parameters constitute the model and are learned from data to encode its patterns, while hyperparameters control how the model is trained. The latter are set manually to tune training.

Model in K-means

  • Model parameters are the cluster centroids/means.
  • Model outputs are the cluster assignments for each data point (see the sketch below).
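
A minimal sketch of this distinction, assuming scikit-learn; the data points and parameter values are illustrative only:

# n_clusters is a hyperparameter: chosen by hand before training.
# cluster_centers_ are model parameters: learned from the data.
# labels_ are model outputs: the cluster assignment of each point.
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1.0, 2.0], [1.5, 1.8], [8.0, 8.0], [8.5, 7.5]])

model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print(model.cluster_centers_)  # learned centroids (model parameters)
print(model.labels_)           # cluster assignments (model outputs)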

Friday, February 23, 2024

Why do some fields have so many publications per year?

Because their contributions (e.g. in the biosciences) come from new data sets analyzed with existing methods. In computer science, by contrast, contributions come from new algorithms validated on existing data sets.

Thursday, February 22, 2024

Ensemble learning

 Ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. Example with CNNs: https://towardsdatascience.com/ensembling-convnets-using-keras-237d429157eb

[Figures: ensemble architecture; decision functions]

The key components of ensemble learning include:

  1. Base Learners (Base Models): These are the individual models that make up the ensemble. They can be of any type, such as decision trees, neural networks, support vector machines, or any other machine learning algorithm.

  2. Ensemble Methods: These are the techniques used to combine the predictions of the base learners. Some common ensemble methods include:

    • Voting: Combining predictions by majority voting (for classification) or averaging (for regression).
    • Bagging (Bootstrap Aggregating): Training multiple base learners on different subsets of the training data, usually sampled with replacement, and then combining their predictions.
    • Boosting: Building a sequence of base learners where each subsequent learner focuses on the examples that previous learners found difficult, giving higher weight to misclassified instances.
    • Stacking: Training a meta-model (or blender) on the predictions of multiple base learners to make the final prediction.
  3. Diversity: Ensuring that the base learners are diverse, meaning they make different types of errors on the data. This diversity is crucial for the ensemble to outperform individual models. It can be achieved by using different algorithms, different subsets of the data, or different hyperparameters.

  4. Aggregation Strategy: This determines how the predictions of the base learners are combined to produce the final output. Common aggregation strategies include averaging, weighted averaging, or selecting the most frequent prediction.

    • Majority Voting: For classification tasks, each base learner's prediction counts as a "vote," and the final prediction is determined by the majority of votes (see the sketch after this list). This is particularly effective when the base learners have similar performance.
    • Weighted Voting: Each base learner's prediction is weighted based on its confidence or performance, and the final prediction is a weighted sum or average of these predictions.
    • Averaging:
      • Simple Average: The predictions of all base learners are averaged to produce the final prediction. This is commonly used in regression tasks.
      • Weighted Average: Similar to weighted voting, but the weights are assigned based on the performance or confidence of each base learner.
    • Stacking (Meta-Learning): Base learners' predictions are used as features to train a higher-level model (meta-model or blender). The meta-model learns how best to combine the base learners' predictions to make the final prediction. This approach can capture more complex relationships between the base learners' predictions.
    • Bagging (Bootstrap Aggregating): Base learners are trained on different subsets of the training data, typically sampled with replacement. The final prediction is often the average (for regression) or majority vote (for classification) of the predictions of all base learners. Random Forest is a popular example of a bagging ensemble method using decision trees as base learners.
    • Boosting: Base learners are trained sequentially, with each subsequent learner focusing on the examples that previous learners found difficult. The final prediction is a weighted sum of the predictions of all base learners. Gradient Boosting Machines (GBMs), AdaBoost, and XGBoost are examples of boosting algorithms.
    • Rank Aggregation: In tasks such as recommender systems or search engines, where the goal is to rank items, rank aggregation methods combine the rankings produced by different algorithms into a single ranking that best represents users' preferences.

  5. Evaluation Metric: The metric used to evaluate the performance of the ensemble. Depending on the task (classification, regression, etc.), different evaluation metrics such as accuracy, precision, recall, F1-score, or mean squared error (MSE) can be used.

  6. Hyperparameters: Ensemble methods often have hyperparameters that need to be tuned for optimal performance. These may include the number of base learners, learning rates (for boosting algorithms), maximum tree depth (for decision-tree-based methods), etc.
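
As a concrete illustration of majority voting, here is a minimal sketch assuming scikit-learn's VotingClassifier; the Iris data set and the choice of base learners are illustrative only:

# Hard (majority) voting over three diverse base learners.
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Diversity among base learners is what lets the ensemble
# outperform its individual members.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("dt", DecisionTreeClassifier(random_state=0)),
        ("svm", SVC(random_state=0)),
    ],
    voting="hard",  # majority vote; "soft" would average predicted probabilities
)
ensemble.fit(X_train, y_train)
print(ensemble.score(X_test, y_test))

Swapping VotingClassifier for BaggingClassifier, AdaBoostClassifier, or StackingClassifier gives the other combination strategies listed above through the same fit/predict interface.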

Friday, February 16, 2024

Arduino board vs. ESP32 vs. NodeMCU vs. Raspberry Pi

https://v89infinity.com/%E0%B8%84%E0%B8%A7%E0%B8%B2%E0%B8%A1%E0%B9%81%E0%B8%95%E0%B8%81%E0%B8%95%E0%B9%88%E0%B8%B2%E0%B8%87%E0%B8%A3%E0%B8%B0%E0%B8%AB%E0%B8%A7%E0%B9%88%E0%B8%B2%E0%B8%87-arduino-board-vs-node-mcu-vs-raspberr/


All are microcontroller boards except NodeMCU, which is an open-source firmware and development kit based on the ESP8266/ESP32, and the Raspberry Pi, which is a single-board computer.


Thursday, February 15, 2024

7-billion-parameter language model

 https://www.scb10x.com/blog/typhoon-innovative-thai-language-model?fbclid=IwAR3MrkVOJ2VqDpds7OKY58X6v0B71ogf9mWMCOu4Azj8Ch0wm5eyxERmE1A

Extended from Mistral 7B:

https://mistral.ai/news/announcing-mistral-7b/


Wednesday, February 14, 2024

Programming language popularity ranking

 https://spectrum.ieee.org/the-top-programming-languages-2023


Survey methodology

https://spectrum.ieee.org/top-programming-languages-methodology


Tuesday, February 13, 2024

Astrophotography

Planetary imaging vs. deep-sky imaging vs. skyscape photography (e.g. the Milky Way)

hallucination

 noun: (artificial intelligence) A confident but incorrect response given by an artificial intelligence.

Monday, February 12, 2024

Thursday, February 8, 2024

Mean, median, mode, norm

 1. Mean: the sum of all values divided by the number of data points.

2. Median: sort the data from smallest to largest and take the value in the middle.

3. Mode: the value that occurs most frequently in the data.

A norm is a reference value chosen to compare against an examinee's score. It can be whatever value we consider most appropriate, e.g. the mean (sum everyone's scores and divide by the number of examinees) or the median (sort everyone's scores in ascending or descending order and take the score of the middle position).
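
A quick check of the three definitions above, using Python's standard-library statistics module; the scores are made up:

import statistics

scores = [3, 5, 5, 6, 8]

print(statistics.mean(scores))    # sum / count = 27 / 5 = 5.4
print(statistics.median(scores))  # middle value after sorting = 5
print(statistics.mode(scores))    # most frequent value = 5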

Tuesday, January 30, 2024

Create custom GPT

 https://zapier.com/blog/custom-chatgpt/

https://openai.com/blog/introducing-gpts

Create a chatbot with the ChatGPT API:

https://www.freecodecamp.org/news/how-to-create-a-chatbot-with-the-chatgpt-api/

https://thaikeras.com/community/main-forum/chatbot/
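
In the spirit of the tutorials linked above, a minimal sketch of a chat loop, assuming the openai Python package (v1+) and an OPENAI_API_KEY environment variable; the model name is just an example:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
history = [{"role": "system", "content": "You are a helpful assistant."}]

while True:
    user_input = input("You: ")
    history.append({"role": "user", "content": user_input})
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # example model name; any chat model works
        messages=history,
    )
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    print("Bot:", reply)

Keeping the full message history in each request is what gives the bot conversational memory; the API itself is stateless.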


ChatGPT vs Bard

 https://zapier.com/blog/chatgpt-vs-bard/