Troy Cheng / 2024-07-08

Heading 2

Heading 3

This is a paragraph with bold, italic, ~~strikethrough~~, and a link.

You just spent a ton of time estimating regressions – and, since regression is the workhorse of computational social science (even in the “deep learning era”^[1]), I think it’s worthwhile and I hope you do too 😺. However, to show you that the benefits of learning to speak the PGM language extend far beyond regression, this problem quickly walks you through an example of detecting changepoints in historical time-series data.

As the saying goes: sharpening your axe will not delay your job of chopping wood. If you are a pure beginner in the world of computers, learning some tools will make you more efficient.

Unordered list item 1
Unordered list item 2

Ordered list item 1
Ordered list item 2

Carlin et al. (1992) contains a data table with the number of “major coal-mining disasters” (operationalized as incidents where 10 or more miners died) per year from 1851 to 1962. Run the following code cells to plot the trend of this data over time, where you may start to see (if you squint your eyes enough) that something(s) may have happened around the turn of the 20th century that decreased the yearly disaster rate!

Inline code
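To make the idea of a changepoint concrete, here is a minimal sketch that is not taken from the original problem. It assumes yearly counts follow a Poisson distribution with one rate before an unknown year and another rate after it, and it picks the split that maximizes the likelihood. This is a plain maximum-likelihood shortcut rather than the PGM approach the text refers to. The function names and `toy_counts` are hypothetical, and the counts are made up for illustration; they are not the Carlin et al. (1992) data.

```python
import math


def poisson_segment_loglik(total, n):
    """Poisson log-likelihood of a segment evaluated at its MLE rate (total / n).

    The log-factorial terms are dropped because they are the same for every split.
    """
    if total == 0:
        return 0.0  # limit of total * log(total / n) - total as total -> 0
    return total * math.log(total / n) - total


def best_changepoint(counts):
    """Return the index k that best splits counts into counts[:k] and counts[k:]."""
    best_k, best_ll = None, -math.inf
    for k in range(1, len(counts)):
        left, right = counts[:k], counts[k:]
        ll = (poisson_segment_loglik(sum(left), len(left))
              + poisson_segment_loglik(sum(right), len(right)))
        if ll > best_ll:
            best_k, best_ll = k, ll
    return best_k


# Made-up yearly counts for illustration only (NOT the Carlin et al. data)
toy_counts = [4, 3, 5, 4, 3, 4, 5, 3, 1, 0, 1, 2, 0, 1, 1, 0]
k = best_changepoint(toy_counts)
print(f"Estimated changepoint index: {k}")
```

On the toy counts the estimated index is 8, the point where the made-up rate drops. A point estimate like this says nothing about uncertainty in the changepoint location, which is one reason a full probabilistic model may be preferable.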

# This is a comment (c1)

import math  # kn, nb
from typing import List  # kn, nb

@staticmethod  # nd (decorator)
def magic_method(x: int, y: float = 3.14) -> float:  # fm, nf, bp, mi, mf
    """This is a docstring (s2)"""
    result = x ** y + 42  # o, mi
    print("Result is:", result)  # nb, s2, nf
    return result  # k

class MyClass:  # nc
    class_var = 123  # nv
    def __init__(self, value):  # fm, nf, bp
        self.value = value  # na, nv

    def double(self) -> int:  # nf
        return self.value * 2  # o, mi

    @property  # nd
    def squared(self):  # nf
        return self.value ** 2  # o, mi

try:  # k
    obj = MyClass(5)  # nc, nv, mi
    output = magic_method(obj.double())  # nf, nb
    print(f"Output: {output}")  # s2, nf
except Exception as err:  # k, nc, nv, err
    print("Error:", err)  # s2, nf, nv

# HTML-like tag for .nt (tag name)
html = "<div class='container'></div>"  # nt, na, s1

# diff test
- old line (gd)
+ new line (gi)

Header 1	Header 2
Cell 1	Cell 2
Cell 3	Cell 4

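Name	Age	Occupation	City	Email	Phone	Notes
Zhang San	28	Data Analyst	Beijing	zhangsan@example.com	13800000001	Likes programming
Li Si	34	Product Manager	Shanghai	lisi@example.com	13800000002	Enjoys photography
Wang Wu	25	Front-end Developer	Guangzhou	wangwu@example.com	13800000003	Basketball fan
Zhao Liu	30	Back-end Developer	Shenzhen	zhaoliu@example.com	13800000004	Travel enthusiast
Qian Qi	27	Operations Engineer	Hangzhou	qianqi@example.com	13800000005	Guitar beginner
Sun Ba	29	UI Designer	Chengdu	sunba@example.com	13800000006	Likes drawing
Zhou Jiu	32	Test Engineer	Xi'an	zhoujiu@example.com	13800000007	Running enthusiast
Wu Shi	26	Data Scientist	Nanjing	wushi@example.com	13800000008	Movie buff
Zheng Shiyi	31	Architect	Chongqing	zheng11@example.com	13800000009	Avid reader
Wang Shi'er	33	Project Manager	Suzhou	wang12@example.com	13800000010	Food lover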

Image Example

Heading 1 in small

Heading 2 in small

Heading 3 in small

This is a paragraph inside small, with bold, italic, ~~strikethrough~~, and a link.

This is a blockquote inside small.

Unordered list item 1
Unordered list item 2

Ordered list item 1
Ordered list item 2

Inline code in small

# Code block in small
import re
from collections import Counter

def clean_text(text):
    # Convert to lowercase, remove non-alphabetic characters
    return re.findall(r'\b[a-z]+\b', text.lower())

def count_words(text):
    words = clean_text(text)
    return Counter(words)

def print_top_words(word_counts, top_n=10):
    print(f"Top {top_n} most common words:")
    for word, count in word_counts.most_common(top_n):
        print(f"{word}: {count}")

def main():
    sample_text = """
    Python is powerful... and fast;
    plays well with others;
    runs everywhere;
    is friendly & easy to learn;
    is Open.
    """
    word_counts = count_words(sample_text)
    print_top_words(word_counts)

if __name__ == "__main__":
    main()

Header 1	Header 2
Cell 1	Cell 2
Cell 3	Cell 4

Image Example

[1] As a person who professes about ML and stats at Carnegie Mellon put it back in 2017, “The dirty secret of the field, and of the current hype, is that 90% of machine learning is a rebranding of nonparametric regression.” Many (most?) statisticians – at least, among the biased sample of statisticians I know – would likely still agree with this statement in 2025!

#notebook #test #AI

Last modified on 2025-07-13

Markdown Feature Test
