MultiAgentDecisionProcess
MDPPolicyIteration implements policy iteration for MDPs via GPU.
#include <MDPPolicyIteration.h>
Public Member Functions
double | GetQ (Index time_step, Index sI, Index jaI) const
Get Q-value for finite-horizon case.
double | GetQ (Index sI, Index jaI) const
Get Q-value for infinite-horizon case.
QTable | GetQTable (Index time_step) const
QTables | GetQTables () const
MDPPolicyIteration ()
(default) Constructor.
MDPPolicyIteration (const PlanningUnitDecPOMDPDiscrete &pu)
void | Plan ()
void | PlanSlow ()
Uses the GetTransitionProbability() interface, which is slow.
void | PlanWithCache (bool computeIfNotCached=true)
void | PlanWithCache (const std::string &filenameCache, bool computeIfNotCached=true)
void | SetQTable (const QTable &Q, Index time_step)
void | SetQTables (const QTables &Qs)
~MDPPolicyIteration ()
Destructor.
Public Member Functions inherited from MDPSolver
Index | GetMaximizingAction (Index time_step, Index sI)
const PlanningUnitDecPOMDPDiscrete * | GetPU () const
Returns a pointer to the PlanningUnit.
virtual double | GetQ (Index time_step, const JointBeliefInterface &jb, Index jaI) const
virtual double | GetQ (const JointBeliefInterface &jb, Index jaI) const
void | LoadQTable (const std::string &filename, QTable &Q)
void | LoadQTables (const std::string &filename, int nrTables, QTables &Qs)
MDPSolver ()
(default) Constructor.
MDPSolver (const PlanningUnitDecPOMDPDiscrete &pu)
void | Print () const
void | SetPU (const PlanningUnitDecPOMDPDiscrete &pu)
virtual | ~MDPSolver ()
Destructor.
Public Member Functions inherited from TimedAlgorithm
void | AddTimedEvent (const std::string &id, clock_t duration)
Adds an event of a given duration, e.g., an external program call.
std::vector< double > | GetTimedEventDurations (const std::string &id)
Returns all stored durations (in seconds) for a particular event.
void | LoadTimers (const std::string &filename)
Load timing info from file filename.
void | PrintTimers () const
Print stored timing info.
void | PrintTimersSummary () const
Sums data and prints out a summary.
void | SaveTimers (const std::string &filename) const
Save collected timing info to file filename.
void | SaveTimers (std::ofstream &of) const
Save collected timing info to ofstream of.
void | StartTimer (const std::string &id) const
Start timing the event identified by id.
void | StopTimer (const std::string &id) const
Stop timing the event identified by id.
TimedAlgorithm ()
(default) Constructor.
virtual | ~TimedAlgorithm ()
Destructor.
Private Member Functions
void | Initialize ()
template<class M >
void | Plan (std::vector< const M * > T)
T is the vector of matrices specifying the transition model (one matrix for each joint action).
Private Attributes
bool | _m_finiteHorizon
Are we solving a finite-horizon problem?
bool | _m_initialized
Is the MDPPolicyIteration object initialized?
QTables | _m_QValues
_m_QValues represents the non-stationary MDP Q function.
MDPPolicyIteration implements policy iteration for MDPs via GPU.
MDPPolicyIteration::MDPPolicyIteration () [inline]
(default) Constructor.
MDPPolicyIteration::MDPPolicyIteration (const PlanningUnitDecPOMDPDiscrete &pu)
References _m_initialized.
MDPPolicyIteration::~MDPPolicyIteration ()
Destructor.
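double MDPPolicyIteration::GetQ (Index time_step, Index sI, Index jaI) const
Get Q-value for finite-horizon case.
Implements MDPSolver.
double MDPPolicyIteration::GetQ (Index sI, Index jaI) const
Get Q-value for infinite-horizon case.
Implements MDPSolver.
QTable MDPPolicyIteration::GetQTable (Index time_step) const
Implements MDPSolver.
References _m_QValues.
QTables MDPPolicyIteration::GetQTables () const [virtual]
Implements MDPSolver.
References _m_QValues.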
void MDPPolicyIteration::Initialize () [private]
References _m_finiteHorizon, _m_initialized, _m_QValues, PlanningUnit::GetHorizon(), PlanningUnitMADPDiscrete::GetNrJointActions(), PlanningUnitMADPDiscrete::GetNrStates(), MDPSolver::GetPU(), Globals::MAXHORIZON, TimedAlgorithm::StartTimer(), and TimedAlgorithm::StopTimer().
Referenced by PlanSlow().
template<class M > void MDPPolicyIteration::Plan (std::vector< const M * > T) [private]
T is the vector of matrices specifying the transition model (one matrix for each joint action).
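As a rough illustration of how a transition model stored as one matrix per joint action might be consumed, the sketch below performs a single greedy Bellman backup over all state/action pairs. It is not taken from the library: `Matrix`, `QTableSketch`, `BackupQ`, and the `R[s][a]` reward layout are hypothetical stand-ins, and the dense nested-vector matrix is only an assumption about what `M` could look like.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the template parameter M:
// a dense matrix with (*T[a])[s][sNext] = P(sNext | s, a).
using Matrix = std::vector<std::vector<double>>;
using QTableSketch = std::vector<std::vector<double>>;  // indexed [s][a]

// One greedy Bellman backup over all (s, a):
//   Q(s,a) = R(s,a) + gamma * sum_{s'} T[a](s,s') * max_{a'} Qprev(s',a')
// T holds one transition matrix per joint action.
QTableSketch BackupQ(const std::vector<const Matrix*>& T,
                     const QTableSketch& R,
                     const QTableSketch& Qprev,
                     double gamma)
{
    const std::size_t nrS = R.size();
    const std::size_t nrA = T.size();
    assert(Qprev.size() == nrS);

    // Value of each successor state under the greedy choice in Qprev.
    std::vector<double> V(nrS);
    for (std::size_t s = 0; s < nrS; ++s)
        V[s] = *std::max_element(Qprev[s].begin(), Qprev[s].end());

    QTableSketch Q(nrS, std::vector<double>(nrA, 0.0));
    for (std::size_t a = 0; a < nrA; ++a) {
        const Matrix& Ta = *T[a];  // transition matrix for joint action a
        for (std::size_t s = 0; s < nrS; ++s) {
            double expected = 0.0;
            for (std::size_t sNext = 0; sNext < nrS; ++sNext)
                expected += Ta[s][sNext] * V[sNext];
            Q[s][a] = R[s][a] + gamma * expected;
        }
    }
    return Q;
}
```

Looping over joint actions in the outer loop means each matrix is read in one pass, which is presumably why a per-action matrix layout is convenient compared with querying individual probabilities.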
void MDPPolicyIteration::Plan () [virtual]
Implements MDPSolver.
References PlanSlow().
void MDPPolicyIteration::PlanSlow ()
Uses the GetTransitionProbability() interface, which is slow.
References _m_finiteHorizon, _m_initialized, PlanningUnitDecPOMDPDiscrete::GetDiscount(), PlanningUnit::GetHorizon(), PlanningUnitMADPDiscrete::GetNrJointActions(), PlanningUnitMADPDiscrete::GetNrStates(), MDPSolver::GetPU(), PlanningUnitDecPOMDPDiscrete::GetReward(), PlanningUnitMADPDiscrete::GetTransitionProbability(), Initialize(), TimedAlgorithm::PrintTimersSummary(), TimedAlgorithm::StartTimer(), and TimedAlgorithm::StopTimer().
Referenced by Plan().
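To make the idea of planning through a per-query transition interface concrete, here is a minimal sketch of infinite-horizon discounted policy iteration on a generic MDP whose model is reached only through callbacks. It does not reproduce PlanSlow(): `TinyMDP`, `QValue`, and `PolicyIteration` are hypothetical names, the callbacks merely play the role that GetTransitionProbability(), GetReward(), and GetDiscount() appear to play, and policy evaluation is done by repeated sweeps rather than an exact linear solve.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical model: every probability and reward is fetched by a call.
struct TinyMDP {
    std::size_t nrStates = 0;
    std::size_t nrActions = 0;
    double discount = 0.0;
    std::function<double(std::size_t, std::size_t, std::size_t)> transition;  // P(sNext | s, a)
    std::function<double(std::size_t, std::size_t)> reward;                   // R(s, a)
};

// Q(s,a) = R(s,a) + discount * sum_{s'} P(s'|s,a) * V(s')
double QValue(const TinyMDP& m, const std::vector<double>& V,
              std::size_t s, std::size_t a)
{
    double q = m.reward(s, a);
    for (std::size_t sNext = 0; sNext < m.nrStates; ++sNext)
        q += m.discount * m.transition(s, a, sNext) * V[sNext];
    return q;
}

// Alternates policy evaluation and greedy improvement until the policy is stable.
// On return, V holds the (approximate) value of the returned policy.
std::vector<std::size_t> PolicyIteration(const TinyMDP& m, std::vector<double>& V)
{
    assert(m.discount >= 0.0 && m.discount < 1.0);
    std::vector<std::size_t> policy(m.nrStates, 0);
    V.assign(m.nrStates, 0.0);

    bool changed = true;
    while (changed) {
        // Policy evaluation: sweep until the largest update is negligible.
        double delta = 0.0;
        do {
            delta = 0.0;
            for (std::size_t s = 0; s < m.nrStates; ++s) {
                const double v = QValue(m, V, s, policy[s]);
                delta = std::max(delta, std::fabs(v - V[s]));
                V[s] = v;
            }
        } while (delta > 1e-10);

        // Policy improvement: switch only on a clear gain to avoid cycling on ties.
        changed = false;
        for (std::size_t s = 0; s < m.nrStates; ++s) {
            std::size_t best = policy[s];
            double bestQ = QValue(m, V, s, best);
            for (std::size_t a = 0; a < m.nrActions; ++a) {
                const double q = QValue(m, V, s, a);
                if (q > bestQ + 1e-6) { bestQ = q; best = a; }
            }
            if (best != policy[s]) { policy[s] = best; changed = true; }
        }
    }
    return policy;
}
```

Each call to `QValue` issues one transition query per successor state, so a full sweep costs on the order of nrStates² × nrActions callback invocations; that per-call overhead is the kind of cost a matrix-based model avoids.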
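void MDPPolicyIteration::PlanWithCache (bool computeIfNotCached=true) [virtual]
Implements MDPSolver.
void MDPPolicyIteration::PlanWithCache (const std::string &filenameCache, bool computeIfNotCached=true) [virtual]
Implements MDPSolver.
void MDPPolicyIteration::SetQTable (const QTable &Q, Index time_step)
Implements MDPSolver.
References _m_QValues.
void MDPPolicyIteration::SetQTables (const QTables &Qs) [virtual]
Implements MDPSolver.
References _m_QValues.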
bool MDPPolicyIteration::_m_finiteHorizon [private]
Are we solving a finite-horizon problem?
Referenced by Initialize() and PlanSlow().
bool MDPPolicyIteration::_m_initialized [private]
Is the MDPPolicyIteration object initialized?
Referenced by Initialize(), MDPPolicyIteration(), and PlanSlow().
QTables MDPPolicyIteration::_m_QValues [private]
_m_QValues represents the non-stationary MDP Q function.
I.e., _m_QValues[t][sI][jaI] gives the expected reward of taking joint action jaI in state sI at time step t (time-to-go = horizon - t).
Referenced by GetQTable(), GetQTables(), Initialize(), SetQTable(), and SetQTables().
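The [t][sI][jaI] indexing and the time-to-go relation can be illustrated with a small stand-alone sketch. The nested-vector layout and the names `QTableSketch`, `QTablesSketch`, `MakeQTables`, `TimeToGo`, and `GreedyJointAction` are assumptions made for illustration; they are not the library's QTable or QTables types, and `GreedyJointAction` only mimics what a function such as GetMaximizingAction(time_step, sI) might compute.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical layout: one table per time step, each indexed [sI][jaI].
using QTableSketch = std::vector<std::vector<double>>;
using QTablesSketch = std::vector<QTableSketch>;  // indexed [t][sI][jaI]

// Allocate zero-initialized Q-values for every time step of a finite horizon.
QTablesSketch MakeQTables(std::size_t horizon, std::size_t nrStates,
                          std::size_t nrJointActions)
{
    return QTablesSketch(
        horizon, QTableSketch(nrStates, std::vector<double>(nrJointActions, 0.0)));
}

// Number of decisions still to be made when acting at time step t.
std::size_t TimeToGo(std::size_t horizon, std::size_t t)
{
    assert(t < horizon);
    return horizon - t;
}

// Joint action with the highest Q-value in state sI at time step t
// (the lowest index wins on ties).
std::size_t GreedyJointAction(const QTablesSketch& Q, std::size_t t, std::size_t sI)
{
    const std::vector<double>& row = Q[t][sI];
    return static_cast<std::size_t>(
        std::distance(row.begin(), std::max_element(row.begin(), row.end())));
}
```

With a horizon of 3, for example, `TimeToGo(3, 1)` yields 2, and the Q-values for that stage live in the table at index 1.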