Dr.Jiw

Saturday, 19 March 2022

Data set repositories

Supplemental information in open-access journals such as PeerJ gives the public access to a paper's data sets and code.

Thursday, 10 March 2022

Google Meet companion mode

Companion mode is for people in the onsite meeting room who join a hybrid Google Meet meeting, i.e., one where some participants join remotely. The remote participants must use Google Meet in normal mode.

Remote participants hear the people in the meeting room as usual, but the people in the room do not hear the source audio that goes into Google Meet coming back out of the other onsite participants' devices in companion mode, so there is no echo (no voice feedback).

Thursday, 3 March 2022

Reinforcement learning

RL learns from interaction rather than from labeled data; the core idea is to gradually improve performance through experience.

1. Learning Through Trial and Error

The agent tries actions, observes results (state transitions and rewards), and updates its knowledge or policy.
Over time, it learns which actions lead to better outcomes.

2. Parameter Updates

Just like in supervised learning, the model (e.g., Q-table, neural network) has parameters (weights).
During training, these parameters are updated to minimize a loss function (e.g., temporal difference error in Q-learning or prediction loss in DQNs).

3. Exploration vs. Exploitation

In training, the agent often explores new actions (e.g., epsilon-greedy strategy) to improve learning.
In the final (deployment) phase, it mainly exploits the learned policy.
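
The exploration/exploitation trade-off above can be sketched with an epsilon-greedy rule. This is a minimal illustration, not code from any particular library: the function name `epsilon_greedy` and the example Q-values are made up for this note.

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index: random with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        # Explore: try any action, even one that currently looks bad.
        return rng.randrange(len(q_values))
    # Exploit: pick the action with the highest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])

q = [0.1, 0.7, 0.3]                          # hypothetical value estimates for 3 actions
training_action = epsilon_greedy(q, epsilon=0.1)   # mostly greedy, sometimes random
deployed_action = epsilon_greedy(q, epsilon=0.0)   # pure exploitation
print(training_action, deployed_action)
```

Setting `epsilon` to 0 corresponds to the deployment phase, where the agent only exploits what it has learned.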

====

There are many algorithms for reinforcement learning; see https://en.wikipedia.org/wiki/Reinforcement_learning

A well-known algorithm is Q-learning.

Reinforcement learning involves an agent, a set of states $S$, and a set $A$ of actions per state. By performing an action $a\in A$, the agent transitions from state to state. Executing an action in a specific state provides the agent with a reward (a numerical score).

Q-learning algorithm

After $\Delta t$ steps into the future the agent will decide some next step. The weight for this step is calculated as $\gamma ^{\Delta t}$ , where $\gamma$ (the discount factor) is a number between 0 and 1 ( $0\leq \gamma \leq 1$ ) and has the effect of valuing rewards received earlier higher than those received later (reflecting the value of a "good start"). $\gamma$ may also be interpreted as the probability to succeed (or survive) at every step $\Delta t$ .

The algorithm, therefore, has a function that calculates the quality of a state–action combination:

$$Q:S\times A\to \mathbb {R}$$

Before learning begins, $Q$ is initialized to a possibly arbitrary fixed value (chosen by the programmer). Then, at each time $t$ the agent selects an action $a_{t}$, observes a reward $r_{t}$, enters a new state $s_{t+1}$ (that may depend on both the previous state $s_{t}$ and the selected action), and $Q$ is updated. $Q(s,a)$ is the expected value of $G$, the sum of future discounted rewards obtained after taking action $a$ in state $s$:

$$G=\sum _{t=0}^{\infty }\gamma ^{t}R_{t+1}=R_{1}+\gamma R_{2}+\gamma ^{2}R_{3}+\cdots$$
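
As a small worked example of the discounted sum $G$, the sketch below evaluates it for a finite list of rewards. The function name and the reward values are illustrative assumptions; a real episode may be much longer or unbounded.

```python
def discounted_return(rewards, gamma):
    """Compute G = R_1 + gamma*R_2 + gamma^2*R_3 + ... for a finite reward list."""
    g = 0.0
    # Work backwards so each step is just: G_t = R_{t+1} + gamma * G_{t+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g

print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))  # 1 + 0.5 + 0.25 = 1.75
print(discounted_return([3.0, 5.0, 7.0], gamma=0.0))  # only the first reward counts: 3.0
```

With a small discount factor, later rewards contribute little; with a discount factor close to 1, they count almost as much as early ones.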

The core of the algorithm is a Bellman equation as a simple value iteration update, using the weighted average of the current value and the new information.

Cf. https://en.wikipedia.org/wiki/Q-learning#Deep_Q-learning

Q-learning uses a Q-table to store Q-values, each representing the quality (expected cumulative reward) of taking action a in state s. Each row of the Q-table represents a state and each column an action; each cell holds the Q-value for that state-action pair.
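
A minimal sketch of such a Q-table and one update step follows. It uses the standard tabular Q-learning rule, a weighted average of the old value and the new estimate $r+\gamma \max _{a'}Q(s',a')$. The learning rate `alpha`, the table size, and the sample transitions are assumptions chosen for illustration.

```python
def make_q_table(n_states, n_actions, init=0.0):
    """Rows are states, columns are actions, every cell starts at `init`."""
    return [[init] * n_actions for _ in range(n_states)]

def q_update(q_table, s, a, r, s_next, alpha=0.5, gamma=0.9):
    """Blend the old Q(s, a) with the new estimate r + gamma * max_a' Q(s_next, a')."""
    target = r + gamma * max(q_table[s_next])
    q_table[s][a] = (1 - alpha) * q_table[s][a] + alpha * target
    return q_table[s][a]

q = make_q_table(n_states=3, n_actions=2)
q_update(q, s=1, a=1, r=2.0, s_next=2)  # q[1][1] moves halfway from 0 towards 2.0
q_update(q, s=0, a=1, r=1.0, s_next=1)  # this target now includes 0.9 * max(q[1])
for row in q:
    print(row)
```

The second update shows how value propagates backwards: state 0 benefits from the reward that was found in state 1.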

Deep Q-learning (DQL) replaces the Q-table, which only suits discrete states and actions, with a neural network (a deep Q-network, DQN) whose outputs approximate the current Q-values. In tabular Q-learning, the Bellman equation is applied directly to update the Q-value stored in the table for each state-action pair. In a DQN, the network is trained to minimize a loss between the target Q-value (derived from the Bellman equation) and the current predicted Q-value.
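
The loss mentioned above can be sketched without any deep-learning library by treating the network outputs as plain lists. This is only an illustration under stated assumptions: a mean squared error loss, a `done` flag marking the end of an episode, and made-up numbers standing in for network predictions. None of these details come from the note itself.

```python
def td_target(reward, next_q_values, gamma, done):
    """Bellman target: r if the episode ended, else r + gamma * max_a' Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)

def dqn_loss(batch, gamma=0.99):
    """Mean squared error between the predicted Q(s, a) and the Bellman target."""
    errors = []
    for q_pred, action, reward, next_q, done in batch:
        target = td_target(reward, next_q, gamma, done)
        errors.append((target - q_pred[action]) ** 2)
    return sum(errors) / len(errors)

# Each item: (Q-values predicted for s, action taken, reward, Q-values for s', done)
batch = [
    ([0.0, 0.0], 0, 1.0, [2.0, 0.0], False),  # target = 1 + 0.5 * 2 = 2, error = 2
    ([0.0, 1.0], 1, 1.0, [0.0, 0.0], True),   # target = 1, prediction = 1, error = 0
]
print(dqn_loss(batch, gamma=0.5))  # (4 + 0) / 2 = 2.0
```

In a real DQN a framework would backpropagate this loss through the network weights; here only the arithmetic of the target and the loss is shown.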

A reward is immediate feedback, whereas a Q-value is the expected cumulative future reward.

--ChatGPT

Snake game:

You want the agent (snake) to learn how to survive and grow longer by playing many games.

The environment (game board) provides feedback through rewards (e.g., +1 for eating food, -1 for dying).

You want the AI to develop strategies like avoiding collisions, planning moves, or maximizing score over time.

Neural network used in DQN for Snake game:

Input: a representation of the environment’s state.

Grid / image input → CNN-based DQN
Feature vector input → MLP-based DQN

1. Grid input

Treat the snake game board as a matrix (like an image).

Input:

0 = empty cell

1 = snake body

2 = snake head

3 = food

If the board is 20×20, the input is a 20×20 matrix (sometimes flattened into 400 values).

Neural nets for this usually use CNNs (like in Atari DQN).
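
A minimal sketch of this grid encoding, assuming the snake is given as a list of (row, column) cells with the head first. The function names and the sample positions are invented for illustration.

```python
EMPTY, BODY, HEAD, FOOD = 0, 1, 2, 3

def encode_board(size, snake, food):
    """Return a size x size grid using the cell codes listed above."""
    grid = [[EMPTY] * size for _ in range(size)]
    for r, c in snake[1:]:
        grid[r][c] = BODY
    head_r, head_c = snake[0]
    grid[head_r][head_c] = HEAD
    food_r, food_c = food
    grid[food_r][food_c] = FOOD
    return grid

def flatten(grid):
    """Turn the 2D grid into one flat list, row by row."""
    return [cell for row in grid for cell in row]

grid = encode_board(20, snake=[(5, 5), (5, 4), (5, 3)], food=(10, 10))
flat = flatten(grid)
print(len(flat))  # 400 values for a 20x20 board
```

A CNN would normally take the 2D grid directly, while the flattened list suits a plain fully connected network.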

2. Feature vector

Simpler and often more efficient. Common features:

Snake head position

Food position

Relative position of food

Snake direction (one-hot: [up, down, left, right])

Danger information (is there a wall or body in the next cell up/down/left/right?)

Snake length
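
The features listed above could be assembled as follows. This is one possible layout, not a standard one: the ordering of the entries, the (row, column) coordinate convention, and the function name are all assumptions made for this sketch.

```python
# (row, col) offsets; dict order gives the one-hot order [up, down, left, right].
DIRECTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

def feature_vector(size, snake, food, direction):
    """Build a 15-value feature list; snake is a list of (row, col), head first."""
    head_r, head_c = snake[0]
    food_r, food_c = food
    body = set(snake[1:])
    one_hot = [1 if direction == d else 0 for d in DIRECTIONS]
    danger = []
    for dr, dc in DIRECTIONS.values():
        r, c = head_r + dr, head_c + dc
        hits_wall = not (0 <= r < size and 0 <= c < size)
        danger.append(1 if hits_wall or (r, c) in body else 0)
    return ([head_r, head_c, food_r, food_c, food_r - head_r, food_c - head_c]
            + one_hot + danger + [len(snake)])

features = feature_vector(20, snake=[(0, 5), (0, 4), (0, 3)], food=(10, 10),
                          direction="right")
print(features)
```

In this example the head sits on the top row with its body to the left, so the danger flags for "up" (wall) and "left" (body) are set.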

Output: estimated Q-values for all possible actions.

A vector of 4 Q values, each for moving up, down, left, right.

Saturday, 26 February 2022

Temporal Convolutional Net (TCN)

https://www.datasciencecentral.com/temporal-convolutional-nets-tcns-take-over-from-rnns-for-nlp-pred/

TCN vs RNN vs LSTM vs GRU: https://arxiv.org/pdf/1803.01271.pdf

Vs. Transformer (used in LLMs): https://www.sabrepc.com/blog/deep-learning-and-ai/rnns-vs-lstm-vs-transformers and https://en.m.wikipedia.org/wiki/Transformer_(deep_learning_architecture)

====ChatGPT:====

What concept does the Transformer use instead of the loop used in an LSTM?

Transformers replace the recurrent loops used in LSTM (Long Short-Term Memory) networks with a mechanism called **self-attention** (or attention mechanism) that allows them to process input sequences in parallel, rather than sequentially.

In LSTMs, the input is processed one timestep at a time, which introduces dependencies between time steps. This sequential nature can make LSTMs slower for long sequences because they require processing each token in the sequence one after the other.

In contrast, the Transformer model (https://youtu.be/ZXiruGOCn9s?si=1WbEjOVKBUs4Z-_J) uses the following key components to avoid loops:

1. **Self-Attention Mechanism**: Each token in the input sequence attends to all other tokens in the sequence. This allows the model to capture dependencies between tokens regardless of their distance in the sequence, in a single step.

2. **Positional Encoding**: Since the Transformer does not have a natural way to process sequences in order (like LSTMs), it adds positional encodings to the input embeddings. These encodings allow the model to understand the position of each token in the sequence.

3. **Parallelization**: By using self-attention and positional encoding, Transformers process the entire sequence simultaneously, making them much faster and more efficient on modern hardware (especially for long sequences), as they can leverage parallelism more effectively than LSTMs.
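
The first two components can be sketched in plain Python. This is a simplified illustration under stated assumptions: queries, keys, and values are all the raw input vectors (a real Transformer applies learned projections first), and the sinusoidal positional encoding is the form from the original Transformer paper, which is one common choice rather than something stated above.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Scaled dot-product self-attention with queries = keys = values = x."""
    d = len(x[0])
    out = []
    for q in x:
        # Every token scores every token, regardless of distance.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return out

def positional_encoding(seq_len, d):
    """Sinusoidal encoding: sin on even dimensions, cos on odd dimensions."""
    pe = []
    for pos in range(seq_len):
        row = []
        for i in range(d):
            angle = pos / (10000 ** ((i - i % 2) / d))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        pe.append(row)
    return pe

embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # three toy token embeddings
pe = positional_encoding(len(embeddings), 2)
x = [[e + p for e, p in zip(tok, pos)] for tok, pos in zip(embeddings, pe)]
print(self_attention(x))
```

The outer loop over `x` is only there because this is plain Python; each token's output depends on the inputs alone and not on another token's output, which is why the computation can run in parallel on real hardware.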

What is a token, as mentioned above?

In the context of natural language processing (NLP) and Transformers, a **token** is a unit of text that the model processes. Tokens can represent:

1. **Words**: In word-level tokenization, each word in a sentence is treated as a token. For example, in the sentence "I love programming," the tokens would be "I," "love," and "programming."

2. **Subwords**: In subword-level tokenization, words are split into smaller components, especially for rare or complex words. For example, the word "unbelievable" might be tokenized as "un," "believ," and "able."

3. **Characters**: In character-level tokenization, each individual character is treated as a token. For instance, the word "hello" would be tokenized into 'h,' 'e,' 'l,' 'l,' 'o.'

Tokens are the basic units the model works with. Before processing text, the model converts raw input text into tokens, which are then transformed into numerical representations (embeddings) for computation.
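
The three tokenization levels and the token-to-number step can be illustrated with a toy sketch. The subword splitter below is a greedy longest-match over a hand-written piece list, chosen only to reproduce the example above. Real tokenizers learn their vocabularies from data, and all function names here are hypothetical.

```python
def word_tokens(text):
    """Word-level: split on whitespace."""
    return text.split()

def char_tokens(word):
    """Character-level: one token per character."""
    return list(word)

def subword_tokens(word, pieces):
    """Toy subword split: repeatedly take the longest known piece."""
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in pieces:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])  # unknown character becomes its own token
            i += 1
    return tokens

def to_ids(tokens):
    """Map each distinct token to an integer id in order of first appearance."""
    vocab = {}
    return [vocab.setdefault(tok, len(vocab)) for tok in tokens]

print(word_tokens("I love programming"))
print(subword_tokens("unbelievable", {"un", "believ", "able"}))
print(to_ids(char_tokens("hello")))
```

The integer ids are what an embedding layer would then turn into vectors.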

Does the Transformer use a sliding-window-like concept?

No, Transformers do not inherently use a **sliding window** concept like traditional sequence models or convolutional models. Instead, they rely on the **self-attention mechanism** to capture relationships between tokens across the entire input sequence.

However, in certain contexts where input sequences are very long, it can be computationally expensive to apply self-attention over the entire sequence. To address this, some **Transformer variants** introduce mechanisms that resemble the sliding window approach. Examples include:

1. **Sliding Attention (Local Attention)**: Some models limit the attention mechanism to a local window around each token, rather than attending to the entire sequence. This approach reduces the computational cost by focusing on nearby tokens, similar to how a sliding window would.

2. **Segmented Attention**: In models like **Longformer** and **BigBird**, self-attention is applied over smaller segments of the sequence (local windows) while maintaining some global attention to preserve connections across distant parts of the sequence.

3. **Reformer**: This model uses **locality-sensitive hashing** (LSH) to reduce the number of tokens that need to attend to each other, which mimics focusing attention within a smaller "window."

While the basic Transformer model does not use a sliding window, these adaptations help handle long sequences efficiently by introducing window-like behavior in the attention mechanism.
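
The local-window idea can be sketched as an attention mask. This is a generic illustration, not the exact scheme of Longformer, BigBird, or Reformer; the function name and the window size are assumptions.

```python
def local_attention_mask(seq_len, window):
    """mask[i][j] is True when token i may attend to token j (|i - j| <= window)."""
    return [[abs(i - j) <= window for j in range(seq_len)] for i in range(seq_len)]

mask = local_attention_mask(seq_len=5, window=1)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))

allowed_pairs = sum(sum(row) for row in mask)
print(allowed_pairs, "of", 5 * 5, "pairs are computed")
```

Full self-attention computes all 25 pairs here, while the windowed version computes only the band around the diagonal, and the saving grows with sequence length.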

Friday, 22 October 2021

R-squared value and Pearson correlation coefficient

In simple linear regression, the R-squared value, denoted $R^2$, is the square of the Pearson correlation coefficient (also known as Pearson's product-moment correlation coefficient), which is a measure of linear correlation between two sets of data.

How to interpret its value: https://condor.depaul.edu/sjost/it223/documents/correlation.htm
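
The relationship between the two quantities can be checked numerically. The sketch below computes Pearson's r directly and, separately, the R-squared of a least-squares straight-line fit (with intercept). The data points are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def r_squared_linear_fit(xs, ys):
    """R^2 = 1 - SS_res / SS_tot for the least-squares line y = slope * x + intercept."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(r ** 2, r_squared_linear_fit(xs, ys))  # both are 0.6 up to rounding
```

The equality holds for a straight-line fit with an intercept; for other models R-squared is still defined but is no longer the square of Pearson's r between x and y.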

Thursday, 7 October 2021

Non-fungible token (NFT)

A unique and non-interchangeable unit of data stored on a digital ledger (blockchain). NFTs can be used to represent easily-reproducible items such as photos, videos, audio, and other types of digital files as unique items (analogous to a certificate of authenticity), and use blockchain technology to establish a verified and public proof of ownership. Copies of the original file are not restricted to the owner of the NFT, and can be copied and shared like any file. The lack of interchangeability (fungibility) distinguishes NFTs from blockchain cryptocurrencies, such as Bitcoin.

NFTs can be used for DRM (digital rights management).

https://en.wikipedia.org/wiki/Non-fungible_token

Fungible = interchangeable

Wednesday, 22 September 2021

Making money from YouTube

https://noobmarketer.com/make-money-on-youtube/

Thursday, 9 September 2021

OWASP Application Security Verification Standard (ASVS)

https://owasp.org/www-project-application-security-verification-standard/?fbclid=IwAR3Z1WEAqkfyIN4_LTNFY7_AEPg7_-Zhq5ZFaKu_QTWgrcdg4YjMRuRnlpU

OWASP ZAP is a web app vulnerability scanner (https://www.zaproxy.org/download/)

Tuesday, 10 August 2021

LaTeX Editor SaaS

https://www.overleaf.com/login

Monday, 9 August 2021

Speech recognition vs Voice recognition

Speech recognition aims to recognize what is being said and transcribe it to text.

Voice recognition (speaker recognition) aims to identify who is speaking.

Saturday, 31 July 2021

SSH visual host key

https://askubuntu.com/questions/1028439/ssh-config-visual-host-key-yes-no-option

Drugs do not need a Thai FDA (อย.) number, but dietary supplements do

Drugs show a drug registration number instead. For example, 1A 12/35 means a modern drug for human use, manufactured in Thailand, with a single active ingredient, registered as number 12 in B.E. 2535 (1992 CE).

https://www.sanook.com/health/15785/
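
The structure of such a registration number can be illustrated with a small parser. It is a sketch built only from the single example above: it splits the fields and expands the year, and it assumes that the two-digit year belongs to the B.E. 25xx range and that the leading digit and letter carry the ingredient and drug-type information. The function name and field names are invented for this note.

```python
import re

def parse_registration(number):
    """Split a registration number such as '1A 12/35' into its fields."""
    m = re.fullmatch(r"(\d)([A-Z])\s+(\d+)/(\d{2})", number.strip())
    if m is None:
        raise ValueError(f"unrecognized registration number: {number!r}")
    ingredient_code, type_code, sequence, yy = m.groups()
    be_year = 2500 + int(yy)          # assumption: years are B.E. 25xx
    return {
        "ingredient_code": ingredient_code,  # '1' in the example: single active ingredient
        "type_code": type_code,              # 'A' in the example: modern human drug made in Thailand
        "sequence": int(sequence),
        "be_year": be_year,
        "ce_year": be_year - 543,            # B.E. to C.E. conversion
    }

print(parse_registration("1A 12/35"))
```

Only the codes `1` and `A` are explained in the example, so the parser leaves the codes uninterpreted rather than guessing at the others.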

Friday, 30 July 2021

Elsevier's and IEEE Xplore's embargo periods

https://www.elsevier.com/journals/future-generation-computer-systems/0167-739X/open-access-options

Embargo Period
For subscription articles, an appropriate amount of time is needed for journals to deliver value to subscribing customers before a manuscript becomes available for free to the public. This is called an embargo period and it begins from the date the article is formally published online in its final and fully citable form.
This journal has an embargo period of 24 months.

https://magazines.ieeeauthorcenter.ieee.org/get-started-with-ieee-magazines/publishing-ethics/guidelines-policies/post-publication-policies/

IEEE policy provides that authors are free to follow funder public access mandates to post accepted articles in repositories. When posting in a repository, the IEEE embargo period is 24 months. However, IEEE recognizes that posting requirements and embargo periods vary by funder. IEEE authors may comply with requirements to deposit their accepted manuscripts in a repository per funder requirements where the embargo is less than 24 months.

Monday, 19 July 2021

Good practice for publishing at academic conferences

The conference must have proceedings (e.g., published by Springer) that are indexed in Scopus, because that helps boost your Scopus h-index.

Saturday, 19 June 2021

How the Tathagata speaks

Prince! Whatever speech the Tathagata knows to be untrue, incorrect, unbeneficial, and not dear or agreeable to others, the Tathagata does not utter that speech.

Whatever speech the Tathagata knows to be true and correct, but unbeneficial and not dear or agreeable to others, the Tathagata does not utter that speech.

Whatever speech the Tathagata knows to be true, correct, and beneficial, but not dear or agreeable to others, the Tathagata chooses the proper time to utter that speech.

Whatever speech the Tathagata knows to be untrue, incorrect, and unbeneficial, but dear and agreeable to others, the Tathagata does not utter that speech.

Whatever speech the Tathagata knows to be true and correct, but unbeneficial, even though dear and agreeable to others, the Tathagata does not utter that speech.

Whatever speech the Tathagata knows to be true, correct, beneficial, and dear and agreeable to others, the Tathagata knows the proper time to utter that speech.

---------------------------

- Majjhima Nikaya, Majjhima-pannasa (ม. ม.) 13/91/94.

Friday, 18 June 2021

pedagogy

Teaching methods

Thursday, 17 June 2021

AI vs CI

Computational intelligence

https://cis.ieee.org/about/what-is-ci#:~:text=Computational%20Intelligence%20(CI)%20is%20the,Fuzzy%20Systems%20and%20Evolutionary%20Computation.

Its techniques:

https://www.researchgate.net/figure/Type-of-computational-intelligence-9_fig2_317263278

AI vs CI

http://www.differencebetween.net/technology/difference-between-ai-and-ci/

AI aims to create intelligent machines that can exhibit intelligent behavior and can think and learn like human beings. CI, on the other hand, is a subset of AI that emphasizes computational paradigms that make intelligent behavior possible in natural or artificial systems in complex and changing environments.
