The problem of finding the maximum (or minimum) of a real function is often encountered in real-life problems. This is especially true when the function describes how close a solution is to some pre-established aim.
We as humans have the natural instinct of searching for better ways to do things:
- finding a shorter route to work
- earning a higher grade while studying less
- finding a balance between all areas of our lives which leads to the highest level of happiness
- and the list goes on
Naturally, not all of these targets are easily quantifiable or achievable, but constructing an appropriate model for such a complex problem is the first step towards solving it.
The textbook way
There are many known methods of finding the maximum (or minimum) of a real function.
Some of the more popular ones (Gradient Descent, for instance) became known in the mainstream CS community due to their applicability in Machine Learning and their simplicity of implementation:
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
% GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters)
%   updates theta by taking num_iters gradient steps with
%   learning rate alpha
  m = length(y);  % number of training examples
  J_history = zeros(num_iters, 1);
  for iter = 1:num_iters
    rsum_0 = 0;
    rsum_1 = 0;
    for i = 1:m
      rsum_0 += (X(i,:) * theta - y(i)) * X(i, 1);
      rsum_1 += (X(i,:) * theta - y(i)) * X(i, 2);
    end
    theta = [theta(1) - (alpha / m) * rsum_0;
             theta(2) - (alpha / m) * rsum_1];
    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);
  end
end
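For readers more comfortable with Python, the same batch update can be written with NumPy, vectorised over the training examples. This is a rough sketch of my own, not the article's code; it assumes a linear hypothesis and the usual squared-error cost:

```python
import numpy as np

def gradient_descent(X, y, theta, alpha, num_iters):
    """Batch gradient descent for linear regression.

    X: (m, 2) design matrix (first column all ones),
    y: (m,) targets, theta: (2,) initial parameters.
    """
    m = len(y)
    for _ in range(num_iters):
        error = X @ theta - y          # (m,) residuals
        grad = X.T @ error / m         # gradient of the squared-error cost
        theta = theta - alpha * grad   # step against the gradient
    return theta

# toy data generated from y = 1 + 2*x, so theta should recover (1, 2)
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
theta = gradient_descent(X, y, np.zeros(2), alpha=0.1, num_iters=5000)
```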
The problem with such methods is their mathematical complexity. It is considerably more difficult to understand the inner workings and intuition behind gradient descent than it is to actually implement it.
Coming up with the algorithm from first principles requires a strong mathematical background. Just opening the Wikipedia article, with its dense wall of notation, is enough to scare most programmers away.
A more intuitive way
Luckily, there is a class of heuristic-based optimization techniques which, when tuned properly, perform on par with the mathematically proven™ methods. Most of these algorithms are heavily inspired by natural phenomena, ranging from the movement of bees in a colony to Darwin's theory of evolution by natural selection.
Particle Swarm Optimization
This particular technique was initially designed as a model to represent the movement of flocks of birds or groups of fish. Later analysis showed that the algorithm was performing optimization.
Assume that we want to minimize a \(2\)-dimensional real function (the method also works for \(n\)-dimensional functions). The main idea behind PSO can be described as follows:
- A particle represents a potential solution (i.e. a pair \((x, y)\)). From this point forward, we will be referring to this pair as the position of the particle.
- Additionally, each particle is assigned a velocity (i.e. a pair \((dx, dy)\)).
- Initially, a swarm of \(N\) randomly placed (or uniformly distributed) particles is generated in the function domain.
- The fitness of each particle (i.e. how well it performs as a solution) is computed; this is equivalent to the actual value of the given function at the position of each particle. Our aim is to minimize this fitness.
- Each particle keeps track of its best previously encountered position. In addition, we also keep track of the best position ever encountered by any particle in the swarm.
- Each particle moves to a new position by adding its velocity to the current position.
- The velocity is then recomputed as a weighted average of three factors:
  - current velocity (weight \(\omega\))
  - the best position of the particle (weight \(\phi_{p_{i}}\))
  - the best position of the entire swarm (weight \(\phi_{G}\))
# p and g are randomly generated in [-1.0, 1.0]
# they act as mutations
def updateVelocity(self, i, p, g):
    deltaParticle = self.bestPos[i] - self.pos[i]
    deltaSwarm = self.swarm.bestPos[i] - self.pos[i]
    newComponent = omega * self.velocity[i] \
                 + phiP * p * deltaParticle \
                 + phiG * g * deltaSwarm
    if i == 0:
        self.velocity = (newComponent, self.velocity[1])
    else:
        self.velocity = (self.velocity[0], newComponent)
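Written out per coordinate, the velocity update above is:

$$ v_i \leftarrow \omega\, v_i + \phi_{p_{i}}\, r_p\, (p_i^{best} - x_i) + \phi_{G}\, r_G\, (G^{best} - x_i) $$

where \(p_i^{best}\) is the particle's own best position, \(G^{best}\) is the swarm's best position, and \(r_p\), \(r_G\) are fresh random numbers drawn from \([-1, 1]\) at every update.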
- The best positions for each particle and for the swarm are updated.
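Put together, the rules above fit in a few dozen lines of Python. This is a minimal, self-contained sketch of my own (names and default parameters are not from the article's codebase, and it uses the conventional uniform \([0, 1)\) random factors rather than \([-1, 1]\) mutations):

```python
import random

def pso(f, bounds, n_particles=30, n_iters=200,
        omega=0.7, phi_p=1.5, phi_g=1.5):
    """Minimize f over bounds = [(xmin, xmax), (ymin, ymax), ...]."""
    dims = len(bounds)
    # swarm of randomly placed particles, all starting at rest
    pos = [[random.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dims for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # per-particle best position
    best_val = [f(*p) for p in pos]
    g_idx = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g_idx][:], best_val[g_idx]  # swarm best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dims):
                rp, rg = random.random(), random.random()
                # weighted average of inertia, own memory, swarm memory
                vel[i][d] = (omega * vel[i][d]
                             + phi_p * rp * (best_pos[i][d] - pos[i][d])
                             + phi_g * rg * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = f(*pos[i])
            if val < best_val[i]:               # update particle best
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:                 # update swarm best
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val

# sanity check on the sphere function, whose minimum is 0 at (0, 0)
random.seed(0)  # for reproducibility
pos, val = pso(lambda x, y: x * x + y * y, [(-5, 5), (-5, 5)])
```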
There is a bit of work involved in choosing appropriate values for \(\omega\), \(\phi_{p_{i}}\) and \(\phi_{G}\), but this job becomes considerably less difficult after understanding how each of these parameters affects the colony.
This set of rules leads to a tremendously organic behaviour. There is no magic behind the scenes; it's just particles that explore the problem domain while learning from their past experience and from each other.
Testing
Wikipedia provides a list of functions designed specifically for evaluating the performance of optimization algorithms.
These are the results I obtained on a subset of them:
Function name: $$ Eggholder $$

Formula: $$ f(x, y) = -(y+47) \sin \sqrt{\left| \frac{x}{2} + (y+47) \right|} - x \sin \sqrt{\left| x - (y+47) \right|} $$

Global minimum: $$ f(512, 404.2319) = -959.6407 $$

Best solution found: $$ f(482.03561, 432.59104) = -956.879905 $$

(Plot and simulation not included.)

Function name: $$ Cross{\hbox{-}}in{\hbox{-}}tray $$

Formula: $$ f(x, y) = -10^{-4} \left[ \left| \sin(x) \sin(y) \exp \left( \left| 100 - \frac{\sqrt{x^{2}+y^{2}}}{\pi} \right| \right) \right| + 1 \right]^{0.1} $$

Global minimum: $$ f(\pm1.34941, \pm1.34941) = -2.06261 $$

Best solution found: $$ f(1.34830, 1.34880) = -2.062612 $$

(Plot and simulation not included.)

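As a sanity check, both formulas translate directly to Python (my transcription; the quoted minima come from the table above):

```python
from math import sin, sqrt, exp, pi

def eggholder(x, y):
    """Eggholder function; global minimum f(512, 404.2319) = -959.6407."""
    return (-(y + 47) * sin(sqrt(abs(x / 2 + (y + 47))))
            - x * sin(sqrt(abs(x - (y + 47)))))

def cross_in_tray(x, y):
    """Cross-in-tray; global minima f(+-1.34941, +-1.34941) = -2.06261."""
    inner = abs(sin(x) * sin(y) * exp(abs(100 - sqrt(x**2 + y**2) / pi)))
    return -1e-4 * (inner + 1) ** 0.1
```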
Conclusion
Optimization techniques usually perform really well on real functions, but they are not limited to this scope. I chose this application because it lends itself to insightful data visualization more easily than other use cases.
The code can be found on GitHub.