The most obvious approach to breaking modern cryptosystems is to attack the underlying mathematical problem.

Factoring: given $N=pq$ with $p<q$ and $p\approx q$, find $p,q$.

Discrete logarithm: given $p$, $g$ and $g^{x} \bmod p$, find $x$.
Classical Algorithms

Brute force, e.g. trial division, which has running time $O(p)=O({N}^{1/2})$.

Baby-step giant-step, Pollard rho, Pollard kangaroo. All have running time $O({p}^{1/2})=O({N}^{1/4})$. (These are also the best known methods for solving discrete log in a general cyclic group.)
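As an illustration of the square-root methods listed above, here is a minimal Python sketch of baby-step giant-step for the discrete logarithm problem. The function name and the choice of a prime modulus are assumptions made for the example, and the sketch trades $O(\sqrt{p})$ memory for the $O(\sqrt{p})$ running time.

```python
from math import isqrt

def baby_step_giant_step(g, h, p):
    """Return x with g^x = h (mod p), or None if no such x exists.

    Illustrative sketch: assumes p is prime, so the order of g divides p - 1.
    Requires Python 3.8+ for pow() with a negative exponent.
    """
    m = isqrt(p - 1) + 1            # m * m >= p - 1, so every exponent is i*m + j
    # Baby steps: remember the smallest j with g^j equal to each value.
    table = {}
    e = 1
    for j in range(m):
        table.setdefault(e, j)
        e = e * g % p
    # Giant steps: compare h * (g^-m)^i against the table.
    factor = pow(g, -m, p)
    gamma = h % p
    for i in range(m):
        if gamma in table:
            return i * m + table[gamma]
        gamma = gamma * factor % p
    return None

# Usage: recover an exponent modulo the small prime 1019, where 2 is a generator.
x = baby_step_giant_step(2, pow(2, 777, 1019), 1019)
```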
Example: for factoring, it is known that using the FFT, given $f\in {\mathbb{Z}}_{N}[x]$ of degree $d$ and points ${x}_{1},\dots,{x}_{d}\in {\mathbb{Z}}_{N}$, the values $f({x}_{1}),\dots,f({x}_{d})$ can be computed in time $O(d\log d)$ (up to further logarithmic factors) and space $O(d)$, which implies the existence of a simple factoring algorithm with running time about $O({N}^{1/4})$.
Modern Algorithms
Define a function
$$L_{a,b}(N)=e^{b(\log N)^{a}(\log\log N)^{1-a}}$$Then note that
$${L}_{0,b}(N)=(\mathrm{log}N{)}^{b}$$which is polynomial in the number of bits in $N$, and
$${L}_{1,b}(N)={N}^{b}$$which is exponential in the number of bits in $N$. For values of $a$ in between we get subexponential functions, i.e. functions that grow faster than polynomials but slower than exponentials.
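To get a feel for how the parameter $a$ interpolates between polynomial and exponential growth, the following sketch evaluates $\log_2 L_{a,b}(N)$, i.e. the size of $L_{a,b}(N)$ in bits. Working in bits is a choice made here only to avoid floating-point overflow; the function name `L_bits` is hypothetical.

```python
import math

def L_bits(a, b, N):
    """Return log2 of L_{a,b}(N) = exp(b * (log N)^a * (log log N)^(1-a))."""
    ln_n = math.log(N)                       # natural log; works for big ints
    exponent = b * ln_n ** a * math.log(ln_n) ** (1 - a)
    return exponent / math.log(2)            # convert nats to bits

N = 2 ** 1024
poly = L_bits(0, 2, N)        # (log N)^2: polynomial in the bit length
nfs = L_bits(1 / 3, 1.902, N) # number field sieve exponent
qs = L_bits(1 / 2, 1, N)      # quadratic sieve exponent
expo = L_bits(1, 1, N)        # N itself: about 1024 bits
```

For a fixed $N$ of this size the four values increase in the order listed, which matches the ordering of the running times in the text.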
Here is a list of some factoring algorithms and their running times. With the exception of Dixon's algorithm, these running times are all obtained using heuristic arguments. We shall see that discrete logarithm algorithms for finite fields are similar.

Dixon's Algorithm: ${L}_{1/2,2}(N)={e}^{2\sqrt{\mathrm{log}N\mathrm{log}\mathrm{log}N}}$

Continued Fractions: ${L}_{1/2,\sqrt{2}}(N)={e}^{\sqrt{2}\sqrt{\mathrm{log}N\mathrm{log}\mathrm{log}N}}$

Quadratic Sieve: ${L}_{1/2,1}(N)={e}^{\sqrt{\log N\log\log N}}$. RSA-129 was factored using this method.

Elliptic Curve: ${L}_{1/2,\sqrt{2}}(p)={L}_{1/2,1}(N)$. Unlike the other algorithms this one takes only polynomial space; the other algorithms have space bounds that are on par with their time bounds.

Number Field Sieve ['88]: ${L}_{1/3,1.902}(N)=e^{1.902(\log N)^{1/3}(\log\log N)^{2/3}}$, roughly $e^{\sqrt[3]{\log N}}$ if the constant and the $\log\log N$ factor are ignored. RSA-512 was factored with this method.
The approach these algorithms take is to find random solutions to $x^{2}=y^{2} \bmod N$. Given such a solution, with probability $1/2$ (namely when $x\ne \pm y \bmod N$), we have that $\gcd(x-y,N)$ or $\gcd(x+y,N)$ is a prime factor of $N$.
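A small worked example may help here. The sketch below checks a congruence of squares and extracts a factor with a gcd; the modulus $N=91$ and the helper name are chosen only for illustration.

```python
from math import gcd

def split_from_squares(x, y, N):
    """Given x^2 = y^2 (mod N), return a nontrivial factor of N or None."""
    assert (x * x - y * y) % N == 0, "x and y must satisfy x^2 = y^2 mod N"
    d = gcd(x - y, N)
    return d if 1 < d < N else None

# 10^2 = 100 = 9 = 3^2 (mod 91) and 10 != +-3 (mod 91): a useful solution.
good = split_from_squares(10, 3, 91)
# 88 = -3 (mod 91), so this solution is trivial and reveals nothing.
bad = split_from_squares(88, 3, 91)
```

In the first call $\gcd(10-3, 91)=7$, while in the second $\gcd(85, 91)=1$, which is the failure case that occurs about half the time.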
Dixon's Algorithm
The first part of the algorithm, known as the sieving step, finds many relations of a certain form. The second part, known as the linear algebra step, uses the relations to find a solution to $x^{2}=y^{2} \bmod N$.

Pick a random $x\in [1,N]$ and compute $z=x^{2} \bmod N$.

Test if $z$ is $S$-smooth for some smoothness bound $S$, i.e. if all prime factors of $z$ are less than $S$. If so, then $z=\prod_{i=1}^{k}l_i^{\alpha_i}$, where $l_1,\dots,l_k$ are the primes less than $S$; record the relation $x^{2}=\prod_{i=1}^{k}l_i^{\alpha_i} \bmod N$.

Repeat until $r$ relations are found, where $r$ is a number like $10k$.

We have $r$ relations modulo $N$, each of the form
$$x_j^{2}=\prod_{i=1}^{k}l_i^{\alpha_{i,j}} \bmod N,\qquad j=1,\dots,r.$$We wish to find a subset of these relations such that the product of the right-hand sides is a square, that is, all the exponents are even. Let $A$ be the $k\times r$ exponent matrix with $A_{ij}=\alpha_{i,j}$, the exponent of $l_i$ in the $j$th relation. Then find a nonzero vector $\bar{y}\in \mathbb{Z}_2^{r}$ such that $A\cdot \bar{y}=\bar{0}$ modulo 2. Then $\bar{y}$ describes a subset of relations that multiply to give a perfect square on the right-hand side. The left-hand side is a product of squares, so this yields a solution to $x^{2}=y^{2} \bmod N$.
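The two steps described above can be sketched in Python as follows. This is a toy illustration under several assumptions that are not in the text: smoothness is tested by trial division, the linear algebra is plain Gaussian elimination over $\mathbb{Z}_2$ on bitmasks (one bitmask per relation, i.e. one column of $A$), and the names `dixon`, `exponent_vector` and `find_dependencies` as well as the default parameters are hypothetical.

```python
import random
from math import gcd

def primes_below(S):
    return [q for q in range(2, S) if all(q % d for d in range(2, int(q ** 0.5) + 1))]

def exponent_vector(z, primes):
    """Exponents of z over `primes` if z factors completely, else None."""
    if z == 0:
        return None
    exps = []
    for l in primes:
        e = 0
        while z % l == 0:
            z //= l
            e += 1
        exps.append(e)
    return exps if z == 1 else None

def find_dependencies(rows):
    """Yield lists of relation indices whose parity vectors sum to zero mod 2."""
    pivots = {}                          # lowest set bit -> (vector, combination)
    for j, vec in enumerate(rows):
        combo = 1 << j                   # which original relations were combined
        while vec:
            bit = vec & -vec             # lowest set bit of the current vector
            if bit not in pivots:
                pivots[bit] = (vec, combo)
                break
            pivot_vec, pivot_combo = pivots[bit]
            vec ^= pivot_vec
            combo ^= pivot_combo
        else:                            # vector reduced to zero: a dependency
            yield [i for i in range(len(rows)) if combo >> i & 1]

def dixon(N, S=30, extra=10, seed=0):
    """Return a nontrivial factor of an odd composite N that is not a prime power."""
    rng = random.Random(seed)
    primes = primes_below(S)
    xs, exps = [], []
    while True:
        # Sieving step: collect relations x^2 = prod l_i^alpha_i (mod N).
        while len(xs) < len(primes) + extra:
            x = rng.randrange(2, N)
            d = gcd(x, N)
            if d > 1:
                return d                 # lucky hit: x shares a factor with N
            e = exponent_vector(x * x % N, primes)
            if e is not None:
                xs.append(x)
                exps.append(e)
        # Linear algebra step: find subsets with all exponents even.
        rows = [sum((a & 1) << i for i, a in enumerate(e)) for e in exps]
        for subset in find_dependencies(rows):
            X, total = 1, [0] * len(primes)
            for j in subset:
                X = X * xs[j] % N
                total = [t + a for t, a in zip(total, exps[j])]
            Y = 1
            for l, t in zip(primes, total):
                Y = Y * pow(l, t // 2, N) % N
            d = gcd(X - Y, N)
            if 1 < d < N:
                return d
        extra += 10                      # only trivial solutions so far: get more

factor = dixon(84923)                    # 84923 = 163 * 521
```

Each dependency gives a nontrivial factor only about half the time, which is why the sketch tries every dependency it finds and gathers further relations if all of them fail.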
It remains to optimize $S$. Define Dixon's function as follows:
$$\psi (x,s)=\left|\{a\in \{1,\dots,x\}\mid a\text{ is }s\text{-smooth}\}\right|$$If we use the heuristic that the proportion of $S$-smooth numbers among the possible values of $z$ is the same as the proportion of $S$-smooth numbers among all numbers less than $N$, then the probability that $z$ is $S$-smooth is $\psi (N,S)/N$, where in general
$$\psi (x,s)/x=\Pr_{a\in \{1,\dots,x\}}[a\text{ is }s\text{-smooth}]\approx u^{-u}$$where $u=\log x/\log s$, a result due to de Bruijn.
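For small parameters the smooth-number count can be computed by brute force and set beside the $u^{-u}$ estimate. The sketch below does this under the convention used in the text, that a number is $S$-smooth when all of its prime factors are less than $S$. The function names are hypothetical, and the estimate is asymptotic, so close agreement should not be expected at this scale.

```python
import math

def is_smooth(a, S):
    """True if every prime factor of a is less than S."""
    l = 2
    while l < S and a > 1:
        while a % l == 0:      # composites l never divide: their primes are gone
            a //= l
        l += 1
    return a == 1

def psi(x, S):
    """Number of S-smooth integers in {1, ..., x}, by exhaustive testing."""
    return sum(1 for a in range(1, x + 1) if is_smooth(a, S))

def u_to_minus_u(x, S):
    """The estimate u^(-u) with u = log x / log S."""
    u = math.log(x) / math.log(S)
    return u ** (-u)

x, S = 10 ** 4, 20
observed = psi(x, S) / x
estimate = u_to_minus_u(x, S)
```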
The sieving step is faster when $S$ is larger, and the linear algebra step is faster when $S$ is smaller, so $S$ must be chosen carefully. It turns out the optimum value for $S$ is
$$S=L_{1/2,2}(N)$$which is also the algorithm's running time. (In fact, because of the simplicity of Dixon's algorithm, it is possible to derive these bounds non-heuristically.)
The matrix involved in the linear algebra step is sparse, and many specialized optimizations have been developed to exploit this. Note also that it is easy to distribute the sieving step among many machines, and furthermore, verifying that the computed relations are correct is cheap (i.e. robustness comes for free, unlike in other distributed computation problems, e.g. SETI@home).
Quadratic Sieve
Define $f_a(x)=(x+\lfloor \sqrt{aN}\rfloor)^{2}-aN$. Then since $\mid y-\lfloor \sqrt{y}\rfloor^{2}\mid \approx \sqrt{y}$, we have $f_a(x)\approx x^{2}+2x\sqrt{aN}-\sqrt{aN}$. When $\mid x\mid$ is small compared to $\sqrt{N}$, $f_a(x)$ is on the order of $\sqrt{aN}$ (up to a factor of about $\mid x\mid$), which is far smaller than $N$.
Then pick a small random $a\leftarrow \{1,...,k\}$. Find all $x\in [-B,B]$ (we shall describe how to do this later) such that $f_a(x)$ is $S$-smooth, where $S,B,k$ will be determined later. For such $x$ we have a relation
$$(x+\lfloor \sqrt{aN}\rfloor)^{2}=\prod _{i=1}^{k}l_i^{\alpha_i}$$modulo $N$, and as before with enough of these we can proceed to the linear algebra step.
Note that $\mid f_a(x)\mid$ is roughly $\sqrt{aN}$ (up to a factor of at most about $2B$), which means it is more likely to be $S$-smooth than an integer on the order of $N$ (which is what is required in Dixon's algorithm).
To find all suitable $x\in [-B,B]$: initialize an array of integers $v$, indexed from $-B$ to $B$, with zeros. For each small prime $l_i$, increment $v[x]$ if $f_a(x)=0 \bmod l_i$. Doing this requires only a simple linear scan: if $\beta_1,\beta_2$ are the roots of $f_a(x)$ in $\mathbb{Z}_{l_i}$, then for every $y$ we increment $v[y]$ if $y=\beta_1$ or $y=\beta_2$ modulo $l_i$. The positions $x$ where $v[x]$ ends up large are those where $f_a(x)$ is divisible by many small primes, and these are the candidates for being $S$-smooth.
With optimal $B,S,k$, we have that the running time is $L_{1/2,1}(N)$ if we use the heuristic that $f_a(x)$ is uniformly distributed.
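The linear scan used to fill the array $v$ can be sketched as below. Several details are assumptions made for the example: the roots of $f_a$ modulo each prime are found by brute force (adequate only for very small primes), the array is stored with an index shift so that `v[x + B]` holds the count for $x$, and each prime contributes a plain increment exactly as in the description above rather than a weight.

```python
from math import isqrt

def sieve_counts(N, a, B, primes):
    """Return (v, f): v[x + B] counts the primes in `primes` dividing f_a(x)."""
    s = isqrt(a * N)                       # floor(sqrt(aN))

    def f(x):
        return (x + s) ** 2 - a * N

    v = [0] * (2 * B + 1)
    for l in primes:
        # Roots of f_a in Z_l, by exhaustive search over the residues.
        roots = [beta for beta in range(l) if f(beta) % l == 0]
        for beta in roots:
            start = -B + (beta + B) % l    # smallest y >= -B with y = beta mod l
            for y in range(start, B + 1, l):
                v[y + B] += 1              # l divides f_a(y)
    return v, f

N, a, B = 84923, 1, 50
primes = [2, 3, 5, 7, 11, 13]
v, f = sieve_counts(N, a, B, primes)
# Positions with the highest counts are the first candidates to test for smoothness.
candidates = sorted(range(-B, B + 1), key=lambda x: -v[x + B])[:5]
```

The point of the stride-$l_i$ loop is that no division of $f_a(x)$ is performed during the scan; only the few candidates with large counts need to be factored afterwards.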
Number Field Sieve
In this method, sieving is done in number fields. Define $d=(\log N/\log\log N)^{1/3}$, and let $m=\lfloor N^{1/d}\rfloor$. Write $N=m^{d}+f_{d-1}m^{d-1}+...+f_{0}$, i.e. $N$ in base $m$, and define the polynomial $f(x)=x^{d}+f_{d-1}x^{d-1}+...+f_{0}$, so by construction $f(m)=0 \pmod N$.
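The base-$m$ construction of the polynomial can be sketched as follows. Here the degree $d$ is passed in as an integer parameter instead of being computed from the formula above, and the function names are hypothetical. For small inputs the leading digit need not be 1, so the sketch returns it explicitly instead of assuming a monic polynomial.

```python
def integer_root(N, d):
    """Largest integer m with m**d <= N, found by binary search."""
    lo, hi = 1, 1 << (N.bit_length() // d + 1)   # hi**d > N by construction
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** d <= N:
            lo = mid
        else:
            hi = mid - 1
    return lo

def base_m_polynomial(N, d):
    """Return (m, coeffs) with N = sum(coeffs[i] * m**i) and m = floor(N^(1/d))."""
    m = integer_root(N, d)
    coeffs = []                  # coeffs[i] is the coefficient of x^i
    n = N
    for _ in range(d):
        n, r = divmod(n, m)
        coeffs.append(r)
    coeffs.append(n)             # leading coefficient, 1 once N is large enough
    return m, coeffs

N = 1000003
m, coeffs = base_m_polynomial(N, 3)
value_at_m = sum(c * m ** i for i, c in enumerate(coeffs))   # equals N, so f(m) = 0 mod N
```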
With overwhelming probability, $f$ is irreducible, so define the field $K=\mathbb{Q}[x]/f(x)$. Then find many pairs $(a,b)$ where $0\le a,b\le L_{1/3,0.901}(N)$ such that

$a-bm$ is $L_{1/3,0.901}(N)$-smooth.

$N_K(a-bx)$ is $L_{1/3,0.901}(N)$-smooth, where $N_K$ is the norm on $K$.
It turns out each pair yields a relation modulo $N$ that can be used in the linear algebra step.
Discrete Logarithm
Suppose our input is $y=g^{\alpha} \pmod p$. Then pick a smoothness bound $S$, and proceed with index calculus:

Pick random $r,a\leftarrow \mathbb{Z}_p$ and set $z=y^{r}g^{a} \bmod p$.

Test if $z$ is $S$-smooth. If so then
$$y^{r}g^{a}=\prod _{i=1}^{k}l_i^{\alpha_i}$$modulo $p$.
Repeat until many (e.g. $10k$) relations are obtained.
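Taking logarithms to the base $g$ turns each relation into a linear equation $r\alpha +a=\sum_{i=1}^{k}\alpha_i\log_g l_i$ modulo $p-1$, where $\alpha =\log_g y$ and the $\log_g l_i$ are the unknowns and the exponents $\alpha_i$ are known. Use linear algebra to solve for $\log_g y=\alpha $ and each $\log_g l_i$.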
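The relation-collection part of index calculus can be sketched as below. This is only a partial illustration: the final linear algebra modulo $p-1$ is omitted, and the relations are instead checked against a brute-force table of discrete logarithms, which is feasible only because the prime $p=1019$ is tiny. The function name, the parameters and the choice of $p$, $g$ and the factor base are all assumptions made for the example.

```python
import random

def collect_relations(p, g, y, primes, count, seed=0):
    """Find triples (r, a, exps) with y^r * g^a = prod primes[i]^exps[i] (mod p)."""
    rng = random.Random(seed)
    relations = []
    while len(relations) < count:
        r = rng.randrange(1, p - 1)
        a = rng.randrange(0, p - 1)
        z = pow(y, r, p) * pow(g, a, p) % p
        exps = []
        for l in primes:                 # trial division by the factor base
            e = 0
            while z % l == 0:
                z //= l
                e += 1
            exps.append(e)
        if z == 1:                       # z was smooth: keep the relation
            relations.append((r, a, exps))
    return relations

p, g, alpha = 1019, 2, 345               # 2 generates the group modulo 1019
y = pow(g, alpha, p)
primes = [2, 3, 5, 7, 11, 13]
relations = collect_relations(p, g, y, primes, count=20)

# Brute-force logarithm table, used here only to check the relations.
logs = {pow(g, k, p): k for k in range(p - 1)}
residuals = [
    (r * alpha + a - sum(e * logs[l] for e, l in zip(exps, primes))) % (p - 1)
    for r, a, exps in relations
]
```

Every entry of `residuals` should be zero, which is exactly the linear equation modulo $p-1$ that the linear algebra step would solve for the unknown logarithms.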