Sunday, 28 June 2026

Techniques for public access to private-IP servers

1. Peer-to-peer overlay network services

ZeroTier and Tailscale are not SSL VPNs. They use UDP hole punching.

UDP hole punching is a technique that allows two devices behind NAT routers (such as home routers) to establish a direct peer-to-peer connection without requiring manual port forwarding.

Here’s how it works:

  1. Both devices contact a public coordination server
    • Suppose Device A is at your home and Device B is in another organization.
    • Both devices first communicate with a publicly reachable server operated by the VPN service (e.g., ZeroTier).
  2. The coordination server learns their public addresses
    • The server observes the public IP address and UDP port assigned by each device’s NAT router.
  3. The server tells each device how to reach the other
    • Device A learns Device B’s public IP and port, and vice versa.
  4. Both devices simultaneously send UDP packets to each other
    • When Device A sends a packet to Device B, its NAT router creates a temporary mapping (a “hole”) allowing return traffic.
    • Device B does the same.
    • Because both sides have opened these temporary holes, the packets can pass through the NATs, establishing a direct connection.

Device A ── NAT A ── Internet ── NAT B ── Device B
    ↑                                         ↑
    └──────── simultaneous UDP packets ───────┘
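The four steps above can be sketched in Python. This is only a localhost simulation under stated assumptions: there is no real NAT involved, the `registry` dictionary stands in for the coordination server, and the names `punch` and `demo` are made up for illustration. It is not how ZeroTier or Tailscale implement the exchange.

```python
import socket
import threading


def punch(sock, peer_addr, name, attempts=5, timeout=0.5):
    """Send probes to peer_addr until a datagram from that peer arrives."""
    sock.settimeout(timeout)
    for _ in range(attempts):
        # Behind a real NAT, this outbound packet is what creates the mapping ("hole").
        sock.sendto(f"hello from {name}".encode(), peer_addr)
        try:
            data, addr = sock.recvfrom(1024)
        except socket.timeout:
            continue  # nothing received yet, probe again
        if addr == peer_addr:
            return data.decode()
    return None


def demo():
    # Two UDP sockets on localhost stand in for Device A and Device B.
    a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    a.bind(("127.0.0.1", 0))
    b.bind(("127.0.0.1", 0))

    # Stand-in for the coordination server (steps 1-3): it knows both addresses.
    registry = {"A": a.getsockname(), "B": b.getsockname()}
    results = {}

    def run(name, sock, peer):
        results[name] = punch(sock, registry[peer], name)

    # Step 4: both sides send at (roughly) the same time.
    threads = [
        threading.Thread(target=run, args=("A", a, "B")),
        threading.Thread(target=run, args=("B", b, "A")),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    a.close()
    b.close()
    return results
```

Calling `demo()` returns what each side received from the other. On a real network, the addresses in `registry` would be the public IP and port observed by the coordination server, not the local socket addresses.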

Why is it called “hole punching”?

Normally, NAT routers block unsolicited incoming packets. By sending outgoing UDP packets first, each device creates a temporary opening (“hole”) in its NAT table that allows packets from the other device to enter.

Advantages

  • No need to configure port forwarding on routers.
  • Enables direct peer-to-peer communication.
  • Lower latency than relaying traffic through a central server.

Limitations

UDP hole punching does not work with all NAT types. It usually succeeds with:

  • Full-cone NAT
  • Restricted-cone NAT
  • Port-restricted cone NAT

It may fail with:

  • Symmetric NAT (common in some enterprise networks and cellular networks)

When hole punching fails, services such as ZeroTier and Tailscale, as well as WebRTC applications, often fall back to relaying traffic through intermediary servers.

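In WebRTC, public addresses are discovered with the STUN protocol, and relay fallback uses TURN servers. Tailscale and ZeroTier follow the same pattern with their own infrastructure: Tailscale relays through its DERP servers and ZeroTier through its root servers.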

2. Cloudflare tunnels 

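Cloudflare Tunnel is a free service that requires a registered domain name. It does not use hole punching. Instead, a daemon on the private server (cloudflared) maintains a long-lived outbound connection to Cloudflare's edge servers, and incoming requests are forwarded to the private server over that connection.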
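The outbound-connection idea can be illustrated with a toy reverse tunnel on localhost. This is a sketch of the general pattern only, not Cloudflare's actual protocol: the names `edge_server` and `private_agent`, the one-line text format, and the single request are all assumptions made for illustration.

```python
import socket
import threading


def edge_server(listener, request, out):
    """Toy 'edge': wait for the agent to dial in, then push a request down that link."""
    conn, _ = listener.accept()
    with conn, conn.makefile("r") as reader:
        conn.sendall((request + "\n").encode())
        out["response"] = reader.readline().strip()


def private_agent(edge_addr, handler):
    """Toy agent on the private server: it only dials OUT, it never listens."""
    with socket.create_connection(edge_addr) as conn, conn.makefile("r") as reader:
        request = reader.readline().strip()
        conn.sendall((handler(request) + "\n").encode())


def demo():
    listener = socket.socket()          # the publicly reachable side
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    out = {}
    edge = threading.Thread(target=edge_server, args=(listener, "GET /status", out))
    edge.start()
    # The private server opens the connection; no inbound port is exposed on it.
    private_agent(listener.getsockname(), lambda req: f"200 OK for {req}")
    edge.join()
    listener.close()
    return out["response"]
```

Because the private side initiates the connection, its NAT or firewall treats all later traffic as replies on an existing outbound flow, which is why no port forwarding is needed.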

Saturday, 20 June 2026

CNN based on spatiotemporal features

CRNN (Convolutional Recurrent Neural Network) and STGCN (Spatio-Temporal Graph Convolutional Network) are both deep learning architectures used to process spatio-temporal data (like videos or time-series networks). The main difference is how they model space: CRNNs treat spatial features as an image grid (using CNNs), while STGCNs treat space as an interconnected topology of specific points (using Graph Neural Networks).

CRNN (Convolutional Recurrent Neural Network)
CRNNs combine Convolutional Neural Networks (CNNs) for spatial feature extraction with Recurrent Neural Networks (RNNs) like LSTMs or GRUs for sequence processing. 
  • Spatial Processing: Applies 2D or 3D CNNs to extract abstract feature representations from regular grid data (like video frames or pixels).
  • Temporal Processing: Uses recurrent memory cells to capture dependencies over time.
  • Common Use Cases: Video classification, image captioning, optical character recognition (OCR), and audio/speech recognition. 
STGCN (Spatio-Temporal Graph Convolutional Network)
STGCNs apply Graph Convolutional Networks (GCNs) to handle non-Euclidean spatial data and pair them with Temporal Convolutional Networks (TCNs) or similar operations for the time domain. 
  • Spatial Processing: Treats data as a graph where entities are "nodes" and relationships are "edges" (e.g., tracking human skeleton joints like hands and knees).
  • Temporal Processing: Processes time sequences in parallel or via temporal convolutions rather than looping sequentially through an RNN.
  • Common Use Cases: Human action recognition from skeleton poses and traffic forecasting on road or sensor networks.
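To make the STGCN idea concrete, here is a minimal pure-Python sketch of one spatial step followed by one temporal step. It is a simplification under stated assumptions: each joint carries a single scalar feature, the graph convolution is a plain neighbor average with self-loops and no learned weights, and the three-joint "arm" graph is invented for the example.

```python
def graph_conv(frame, edges):
    """Spatial step: each node becomes the mean of itself and its graph neighbors."""
    n = len(frame)
    neighbors = {i: {i} for i in range(n)}  # self-loops
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    return [sum(frame[j] for j in neighbors[i]) / len(neighbors[i]) for i in range(n)]


def temporal_conv(sequence, kernel):
    """Temporal step: 1D convolution over time, applied to each node independently."""
    k = len(kernel)
    n = len(sequence[0])
    return [
        [sum(kernel[d] * sequence[t + d][i] for d in range(k)) for i in range(n)]
        for t in range(len(sequence) - k + 1)
    ]


# Hypothetical skeleton: shoulder(0) - elbow(1) - wrist(2)
edges = [(0, 1), (1, 2)]
frames = [[0, 0, 3], [0, 3, 0], [3, 0, 0]]  # one feature per joint, three time steps

spatial = [graph_conv(f, edges) for f in frames]
output = temporal_conv(spatial, kernel=[0.5, 0.5])
```

A CRNN would replace `graph_conv` with a convolution over a pixel grid and `temporal_conv` with a recurrent cell that steps through the frames one at a time.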

Thursday, 18 June 2026

Optimization of model parameters vs hyperparameters

## 1. Model Parameter Optimization Methods

These methods are the actual **optimization algorithms** that update the internal weights (w) and biases (b) of a model during the training phase based on the calculated gradients.

### First-Order Optimization (Gradient-Based)

 * **Stochastic Gradient Descent (SGD):** The foundational method. It calculates the gradient of the loss function for a small batch (or a single sample) and takes a step in the direction of the steepest descent.

 * **Momentum:** An extension of SGD that accelerates the optimization by adding a fraction of the previous step's update vector. This helps "roll" past local minima and dampens oscillations.

 * **Adam (Adaptive Moment Estimation):** A widely used default optimizer for deep learning. It computes an adaptive learning rate for each individual parameter by tracking running averages of both the first moment (the mean) and the second moment (the uncentered variance) of the gradients.
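The three update rules above can be compared on a toy problem. The sketch below minimizes f(w) = (w - 3)^2 with a single scalar weight. The objective, learning rates, and step counts are arbitrary choices for illustration, and the gradient is exact rather than estimated from a mini-batch.

```python
import math


def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)


def sgd(w, lr=0.1, steps=100):
    for _ in range(steps):
        w -= lr * grad(w)  # step against the gradient
    return w


def momentum(w, lr=0.1, beta=0.9, steps=300):
    v = 0.0
    for _ in range(steps):
        v = beta * v - lr * grad(w)  # keep a fraction of the previous update
        w += v
    return w


def adam(w, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g        # first moment (mean)
        v = beta2 * v + (1 - beta2) * g * g    # second moment (uncentered variance)
        m_hat = m / (1 - beta1 ** t)           # bias correction
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w
```

Starting from `w = 0.0`, all three functions should end close to the minimum at `w = 3`. In a real network, `w` is a large tensor and Adam keeps a separate `m` and `v` entry for every parameter.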

### Second-Order Optimization (Curvature-Based)

 * **L-BFGS (Limited-memory Broyden–Fletcher–Goldfarb–Shanno):** A quasi-Newton method that approximates the inverse Hessian (the curvature of the loss function) from a short history of recent gradients and updates, without ever forming the full matrix. Each step costs more than an SGD step and it works best with full-batch gradients, so it is mainly used for smaller datasets and traditional algorithms like logistic regression or CRFs.

## 2. Hyperparameter Optimization (HPO) Methods

These are the macro-level strategies used to search for the best external configurations (e.g., finding the best learning rate, number of layers, or dropout rate) *before* the inner parameter training loop begins.

### Traditional/Exhaustive Search

 * **Grid Search:** Performs an exhaustive search over a manually specified grid of discrete values.

   * *Example:* Testing every combination of learning rates [0.1, 0.01] and batch sizes [32, 64].

 * **Random Search:** Instead of checking every single point on a grid, it randomly samples configurations from a specified statistical distribution over a fixed number of iterations. It is often more efficient than grid search when only a few hyperparameters really matter, because every trial tests a new value of each important hyperparameter instead of revisiting the same few grid values.
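A minimal sketch of both strategies follows. The `validation_loss` function is a hypothetical stand-in for "train a model and measure its validation loss"; its shape, the search ranges, and the batch-size choices are assumptions made for illustration.

```python
import itertools
import math
import random


def validation_loss(lr, batch_size):
    """Hypothetical objective: best near lr = 0.01 and batch_size = 64."""
    return (math.log10(lr) + 2) ** 2 + 0.001 * abs(batch_size - 64)


def grid_search(lrs, batch_sizes):
    """Evaluate every combination on the grid and return (best_trial, trial_count)."""
    trials = [
        ((lr, bs), validation_loss(lr, bs))
        for lr, bs in itertools.product(lrs, batch_sizes)
    ]
    return min(trials, key=lambda t: t[1]), len(trials)


def random_search(n_trials, seed=0):
    """Sample configurations from distributions instead of a fixed grid."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        lr = 10 ** rng.uniform(-4, 0)          # log-uniform between 1e-4 and 1
        bs = rng.choice([16, 32, 64, 128])
        trials.append(((lr, bs), validation_loss(lr, bs)))
    return min(trials, key=lambda t: t[1])
```

With the grid from the example above, `grid_search([0.1, 0.01], [32, 64])` evaluates four combinations. `random_search(20)` spends its 20 trials on 20 different learning rates, which is where its advantage comes from when the learning rate matters more than the batch size.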

### Informed/Sequential Search

 * **Bayesian Optimization:** A smart, sequential strategy. It builds a probabilistic model (a "surrogate model," often using Gaussian Processes) of the objective function based on past evaluation results. It uses this model to mathematically predict which hyperparameter combination is most promising to try next, balancing exploration and exploitation.

### Heuristic & Evolutionary Algorithms

 * **Genetic Algorithms (GA):** A population of hyperparameter sets is initialized. The best-performing sets are selected to "reproduce" (combine hyperparameter values from two parents, known as crossover) and undergo random "mutation" to create the next generation of hyperparameters.
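The selection, crossover, and mutation loop can be sketched as follows. The `fitness` function is a hypothetical score standing in for validation accuracy, and the population size, mutation rate, and value ranges are arbitrary assumptions.

```python
import random


def fitness(cfg):
    """Hypothetical score (higher is better); peaks at lr exponent -3, dropout 0.2."""
    lr_exp, dropout = cfg
    return -((lr_exp + 3) ** 2) - (dropout - 0.2) ** 2


def evolve(generations=30, pop_size=20, seed=0):
    rng = random.Random(seed)
    # Each individual is (log10 learning rate, dropout rate).
    pop = [(rng.uniform(-6, 0), rng.uniform(0, 0.9)) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        parents = pop[: pop_size // 2]            # selection: keep the top half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = (a[0], b[1])                  # crossover: one value from each parent
            if rng.random() < 0.3:                # mutation: small random perturbation
                child = (
                    child[0] + rng.gauss(0, 0.3),
                    min(0.9, max(0.0, child[1] + rng.gauss(0, 0.05))),
                )
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness), history
```

Because the parents survive into the next generation, the best score recorded in `history` never gets worse from one generation to the next.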

### Early-Stopping Based Methods

 * **Hyperband:** An advanced variation of random search that uses a "successive halving" approach. It starts many training runs with random configurations simultaneously but only allocates a tiny resource budget (e.g., a few epochs) to them initially. It aggressively terminates poor performers early and funnels the remaining training budget into the most promising setups.
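The successive-halving core can be sketched in a few lines. Note that this shows a single bracket only, while full Hyperband runs several such brackets with different starting budgets. The `partial_loss` function is a hypothetical stand-in for "train this configuration for a given number of epochs".

```python
def partial_loss(cfg, epochs):
    """Hypothetical: loss after `epochs` epochs, approaching cfg['floor'] over time."""
    return cfg["floor"] + 1.0 / epochs


def successive_halving(configs, min_epochs=1, eta=2):
    """Keep the best 1/eta of the configurations each round and multiply their budget by eta."""
    survivors = list(configs)
    epochs = min_epochs
    rounds = []
    while len(survivors) > 1:
        rounds.append((len(survivors), epochs))
        survivors.sort(key=lambda c: partial_loss(c, epochs))
        survivors = survivors[: max(1, len(survivors) // eta)]  # drop poor performers
        epochs *= eta                                            # give survivors more budget
    return survivors[0], rounds


configs = [{"floor": f} for f in [0.5, 0.2, 0.9, 0.1, 0.7, 0.3, 0.8, 0.4]]
winner, rounds = successive_halving(configs)
```

Here eight configurations get one epoch each, the best four get two epochs, and the best two get four epochs, so most of the budget goes to the promising setups.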

### Summary of the Workflow Hierarchy

```
[ Hyperparameter Optimization (e.g., Bayesian Optimization) ]
       │
       ▼  Chooses a setup (e.g., Learning Rate = 0.001)
       │
   ┌───┴───────────────────────────────────────────┐
   │ Inner Loop: Training Phase                    │
   │                                               │
   │ [ Model Parameter Optimization (e.g., Adam) ] │
   │       │                                       │
   │       ▼  Updates Weights and Biases           │
   │   (Minimizes Loss Function on Data)           │
   └───────────────────────────────────────────────┘
```
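The same hierarchy can be written as two nested loops. This is a toy sketch: the inner loop is plain gradient descent on f(w) = (w - 3)^2 instead of Adam, and the outer loop simply tries a fixed list of learning rates instead of Bayesian optimization. Both substitutions are assumptions made to keep the example short.

```python
def train(lr, steps=20):
    """Inner loop: update the weight w with gradient descent, return the final loss."""
    w = 0.0
    for _ in range(steps):
        w -= lr * 2.0 * (w - 3.0)   # gradient of (w - 3)^2
    return (w - 3.0) ** 2


def tune(candidate_lrs):
    """Outer loop: run one full training per learning rate and keep the best."""
    results = {lr: train(lr) for lr in candidate_lrs}
    best_lr = min(results, key=results.get)
    return best_lr, results


best_lr, results = tune([0.001, 0.01, 0.1, 1.1])
```

The outer loop never touches `w` directly. It only sees the final loss of each complete training run, which is why hyperparameter search is so much more expensive than a single training.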


Saturday, 13 June 2026

Active Learning

Active learning is a proactive approach to teaching and learning that emphasizes learner participation, hands-on practice, and independent analytical thinking. Learners shift from being passive recipients (just sitting and listening to the teacher) to constructing their own knowledge through activities such as brainstorming, project work, and discussion.

The concept can be understood in three main parts:
🌟 Changing roles
  • Learner: The center of the learning process, responsible for thinking, analyzing, doing, and exchanging ideas with peers.
  • Teacher: Shifts from "lecturer" to "facilitator" or coach, offering guidance and inspiration.
💡 Example activity formats
  • Problem-Based Learning (PBL): Learners work together to solve a problem or a simulated scenario.
  • Brainstorming: Learners exchange ideas freely to find answers together.
  • Cooperative Learning: Learners work in groups, do projects, or pair up to review material (Think-Pair-Share).
  • Role-playing: Learners take on assumed roles to gain a deeper understanding of the content or its consequences.
🎯 Benefits
  • Helps learners retain content longer (because they practice it hands-on).
  • Develops higher-order thinking skills such as analysis, synthesis, and problem solving.
  • Builds social skills such as teamwork and communication.

Wednesday, 10 June 2026

The rise of quantum computing

“Why is quantum computing becoming feasible now after being proposed decades ago?”

The idea is old

The foundations of quantum computing were developed in the 1980s and 1990s by researchers such as Richard Feynman, David Deutsch, and Peter Shor.

Important milestones:

  • 1981: Feynman proposed simulating quantum systems with quantum machines.
  • 1994: Shor discovered a polynomial-time quantum algorithm for factoring large numbers, far faster than any known classical method.
  • 1996: Grover introduced a quantum search algorithm that gives a quadratic speedup for unstructured search.

These discoveries generated enormous excitement.

Why didn’t it take off immediately?

Because building a quantum computer is extraordinarily difficult.

A classical bit is either:

  • 0
  • 1

A quantum bit (qubit) can exist in a quantum state involving both possibilities simultaneously.
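A single qubit can be simulated with two complex amplitudes. The sketch below is a minimal illustration under that standard representation: the state is a pair (a, b) for a|0⟩ + b|1⟩, the Hadamard gate turns |0⟩ into an equal superposition, and the squared magnitudes give the measurement probabilities. The function names are chosen for this example.

```python
import math


def hadamard(state):
    """Apply the Hadamard gate to a single-qubit state (a, b)."""
    a, b = state
    s = 1 / math.sqrt(2)
    return (s * (a + b), s * (a - b))


def probabilities(state):
    """Probability of measuring 0 and 1: the squared magnitudes of the amplitudes."""
    a, b = state
    return (abs(a) ** 2, abs(b) ** 2)


zero = (1 + 0j, 0 + 0j)          # the classical-like state |0>
superposed = hadamard(zero)      # equal mix of |0> and |1>
p0, p1 = probabilities(superposed)
```

Here `p0` and `p1` are both about 0.5. Simulating n qubits this way needs 2^n amplitudes, which is why classical simulation becomes impractical as n grows.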

The problem is that qubits are extremely fragile:

  • Heat destroys quantum states.
  • Electromagnetic noise causes errors.
  • Vibrations cause decoherence.
  • Measurement collapses the state.

For decades, researchers knew the theory but could not build machines large enough to be useful.

What changed recently?

1. Better hardware engineering

Researchers learned how to manufacture and control:

  • Superconducting qubits
  • Trapped-ion qubits
  • Neutral-atom qubits
  • Photonic qubits

Companies such as IBM Quantum, Google Quantum AI, IonQ, and Quantinuum have demonstrated increasingly large and reliable quantum processors.

2. Advances in error correction

A practical quantum computer may require thousands or even millions of physical qubits to create a much smaller number of reliable logical qubits.

For many years, error correction was mostly theoretical. Recently, experimental demonstrations have shown that logical qubits can become more reliable as more physical qubits are added, an important milestone.
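The intuition that redundancy lowers the error rate can be shown with a classical analogy: a repetition code decoded by majority vote. This is only an analogy. Real quantum codes cannot simply copy a qubit and must also correct phase errors, so the numbers below illustrate the scaling idea and nothing more. The physical error rate of 0.1 is an arbitrary assumption.

```python
from math import comb


def logical_error_rate(n, p):
    """Probability that a majority of n independent copies are flipped (n odd)."""
    return sum(
        comb(n, k) * p ** k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )


p = 0.1  # assumed physical error rate per copy
rates = {n: logical_error_rate(n, p) for n in (1, 3, 5, 7)}
```

With `p = 0.1`, one copy fails 10% of the time, three copies about 2.8%, and five copies under 1%. The improvement only holds while the physical error rate is below a threshold (0.5 for this toy code), which mirrors why hardware quality had to improve before adding qubits could help.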

3. Improved cryogenic and control systems

Many quantum computers operate near absolute zero:

  • Room temperature ≈ 300 K
  • Superconducting quantum processors ≈ 0.01 K

Advances in refrigeration, microwave electronics, and precision control have made experiments possible at larger scales.

4. Significant investment

Governments and industry have invested billions of dollars because quantum computing could potentially impact:

  • Cryptography
  • Materials science
  • Drug discovery
  • Optimization
  • Quantum chemistry

Why isn’t quantum computing as popular as AI?

Because quantum computing still lacks a “ChatGPT moment.”

AI became popular when ordinary people could immediately see value:

  • Writing text
  • Generating images
  • Coding assistance
  • Translation

Quantum computers currently:

  • Are expensive.
  • Have limited numbers of high-quality qubits.
  • Remain primarily research tools.
  • Solve only a narrow range of problems better than classical computers.

Most people cannot yet use a quantum computer to improve their daily work.

A useful analogy

If AI in 2026 is like the internet around 2010—already transforming daily life—then quantum computing is more like the internet around 1975:

  • The underlying science is real.
  • Experts know it is important.
  • Significant prototypes exist.
  • Commercial potential is visible.
  • But widespread practical use is still emerging.

Quantum computing may eventually become revolutionary, but unlike AI, it has not yet reached the stage where the average person can benefit from it directly.

—ChatGPT