Heading 2
Heading 3
This is a paragraph with bold, italic, strikethrough, and a link.
You just spent a ton of time estimating regressions – and, since regression is the workhorse of computational social science (even in the “deep learning era”[1]), I think it’s worthwhile and I hope you do too 😺. However, to show you that the benefits of learning to speak the PGM language extend far beyond regression, this problem quickly walks you through an example of detecting changepoints in historical time-series data.
Carlin et al. (1992) contains a data table with the number of “major coal-mining disasters” (operationalized as incidents where 10 or more miners died) per year from 1851 to 1962. Run the following code cells to plot the trend of this data over time, where you’ll maybe start to see (if you squint your eyes enough) that something(s) may have happened around the turn of the 20th century that decreased the yearly disaster rate!
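To make the changepoint idea concrete before touching the real data, here is a minimal sketch. It is not the PGM approach this problem builds toward: it swaps in a plain maximum-likelihood search for a single changepoint in Poisson counts. The counts in `toy_counts` are invented for illustration (they are not the Carlin et al. data), and the function names `poisson_loglik` and `best_changepoint` are hypothetical helpers, not part of any library.

```python
import math


def poisson_loglik(counts):
    """Log-likelihood of counts under one Poisson rate, set to its MLE (the mean)."""
    n, total = len(counts), sum(counts)
    if total == 0:
        return 0.0  # a rate of 0 explains all-zero counts with probability 1
    rate = total / n
    return sum(c * math.log(rate) - rate - math.lgamma(c + 1) for c in counts)


def best_changepoint(counts):
    """Try every split point k and keep the one with the highest total log-likelihood.

    k is the index of the first observation in the second regime.
    Assumes exactly one changepoint and a constant rate on each side.
    """
    best_k, best_ll = None, -math.inf
    for k in range(1, len(counts)):
        ll = poisson_loglik(counts[:k]) + poisson_loglik(counts[k:])
        if ll > best_ll:
            best_k, best_ll = k, ll
    return best_k, best_ll


# Made-up yearly counts: a higher rate early on, a lower rate afterwards.
toy_counts = [4, 5, 3, 4, 6, 5, 4, 3, 1, 0, 1, 2, 0, 1, 1, 0]

k, ll = best_changepoint(toy_counts)
print(f"Estimated changepoint index: {k}")
print(f"Mean before: {sum(toy_counts[:k]) / k:.2f}, "
      f"mean after: {sum(toy_counts[k:]) / (len(toy_counts) - k):.2f}")
```

A search like this only returns a single best guess. A probabilistic model of the same setup would instead give a full distribution over where the change happened and over the two rates.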
As the saying goes: sharpening your axe will not delay your job of chopping wood. If you are a complete beginner with computers, learning a few tools will make you more efficient.
- Unordered list item 1
- Unordered list item 2
1. Ordered list item 1
2. Ordered list item 2
Inline code
```python
# This is a comment (c1)
import math  # kn, nb
from typing import List  # kn, nb

@staticmethod  # nd (decorator)
def magic_method(x: int, y: float = 3.14) -> float:  # fm, nf, bp, mi, mf
    """This is a docstring (s2)"""
    result = x ** y + 42  # o, mi
    print("Result is:", result)  # nb, s2, nf
    return result  # k

class MyClass:  # nc
    class_var = 123  # nv

    def __init__(self, value):  # fm, nf, bp
        self.value = value  # na, nv

    def double(self) -> int:  # nf
        return self.value * 2  # o, mi

    @property  # nd
    def squared(self):  # nf
        return self.value ** 2  # o, mi

try:  # k
    obj = MyClass(5)  # nc, nv, mi
    output = magic_method(obj.double())  # nf, nb
    print(f"Output: {output}")  # s2, nf
except Exception as err:  # k, nc, nv, err
    print("Error:", err)  # s2, nf, nv

# HTML-like tag for .nt (tag name)
html = "<div class='container'></div>"  # nt, na, s1
```

```diff
# diff test
- old line (gd)
+ new line (gi)
```
Header 1 | Header 2 |
---|---|
Cell 1 | Cell 2 |
Cell 3 | Cell 4 |
Name | Age | Occupation | City | Email | Phone | Notes |
---|---|---|---|---|---|---|
Zhang San | 28 | Data Analyst | Beijing | zhangsan@example.com | 13800000001 | Enjoys programming |
Li Si | 34 | Product Manager | Shanghai | lisi@example.com | 13800000002 | Photography enthusiast |
Wang Wu | 25 | Front-end Developer | Guangzhou | wangwu@example.com | 13800000003 | Basketball fan |
Zhao Liu | 30 | Back-end Developer | Shenzhen | zhaoliu@example.com | 13800000004 | Avid traveler |
Qian Qi | 27 | Operations Engineer | Hangzhou | qianqi@example.com | 13800000005 | Beginner guitarist |
Sun Ba | 29 | UI Designer | Chengdu | sunba@example.com | 13800000006 | Enjoys drawing |
Zhou Jiu | 32 | Test Engineer | Xi'an | zhoujiu@example.com | 13800000007 | Running enthusiast |
Wu Shi | 26 | Data Scientist | Nanjing | wushi@example.com | 13800000008 | Movie buff |
Zheng Shiyi | 31 | Architect | Chongqing | zheng11@example.com | 13800000009 | Avid reader |
Wang Shi'er | 33 | Project Manager | Suzhou | wang12@example.com | 13800000010 | Food lover |

Heading 1 in small
Heading 2 in small
Heading 3 in small
This is a paragraph inside small, with bold, italic, strikethrough, and a link.
This is a blockquote inside small.
- Unordered list item 1
- Unordered list item 2
1. Ordered list item 1
2. Ordered list item 2
Inline code in small
```python
# Code block in small
import re
from collections import Counter

def clean_text(text):
    # Convert to lowercase, remove non-alphabetic characters
    return re.findall(r'\b[a-z]+\b', text.lower())

def count_words(text):
    words = clean_text(text)
    return Counter(words)

def print_top_words(word_counts, top_n=10):
    print(f"Top {top_n} most common words:")
    for word, count in word_counts.most_common(top_n):
        print(f"{word}: {count}")

def main():
    sample_text = """
    Python is powerful... and fast;
    plays well with others;
    runs everywhere;
    is friendly & easy to learn;
    is Open.
    """
    word_counts = count_words(sample_text)
    print_top_words(word_counts)

if __name__ == "__main__":
    main()
```
Header 1 | Header 2 |
---|---|
Cell 1 | Cell 2 |
Cell 3 | Cell 4 |
[1] As a person who professes about ML and stats at Carnegie Mellon put it back in 2017, “The dirty secret of the field, and of the current hype, is that 90% of machine learning is a rebranding of nonparametric regression.” Many (most?) statisticians – at least, among the biased sample of statisticians I know – would likely still agree with this statement in 2025!