Algorithms for Dynamic Spectrum Access with Learning for Cognitive Radio

We study the problem of dynamic spectrum sensing and access in cognitive radio systems as a partially observed Markov decision process (POMDP). A group of cognitive users cooperatively tries to exploit vacancies in a set of primary (licensed) channels whose occupancies evolve as Markov processes. We first consider the scenario in which the cognitive users know the distribution of the signals they receive from the primary users and use their analog observations to track the occupancies of the primary channels being monitored. For this problem, we obtain a greedy channel selection and access policy that maximizes the instantaneous reward while satisfying a constraint on the probability of interfering with licensed transmissions. Through simulation, we show that this policy achieves a substantial performance improvement over an existing scheme that uses ACK signals to track the channel occupancies. We also derive an analytical universal upper bound on the performance of the optimal policy, against which we compare our scheme to further demonstrate its efficiency.
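The belief tracking and greedy selection described above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it assumes a two-state (free/occupied) Markov chain per channel, a Gaussian analog observation with mean 0 on a free channel and mean mu_occ on an occupied one, one sensed channel per slot, and a simplified access rule that transmits only when the posterior occupancy probability is at most a threshold zeta. All names and parameters (p01, p11, mu_occ, sigma, zeta, observe) are hypothetical, and the paper's actual interference constraint may be formulated differently.

```python
import math


def gaussian_pdf(y, mean, sigma):
    """Density of a Gaussian observation (assumed observation model)."""
    z = (y - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def predict(p_occ, p01, p11):
    """One-step Markov prediction of P(occupied).

    p01 = P(occupied next | free now), p11 = P(occupied next | occupied now).
    """
    return p_occ * p11 + (1.0 - p_occ) * p01


def update(p_occ, y, mu_occ, sigma):
    """Bayes update of P(occupied) from one analog observation y."""
    like_occ = gaussian_pdf(y, mu_occ, sigma)
    like_free = gaussian_pdf(y, 0.0, sigma)
    num = p_occ * like_occ
    return num / (num + (1.0 - p_occ) * like_free)


def greedy_select(beliefs):
    """Pick the channel currently believed least likely to be occupied."""
    return min(range(len(beliefs)), key=lambda i: beliefs[i])


def step(beliefs, observe, p01, p11, mu_occ, sigma, zeta):
    """One slot: predict all beliefs, sense one channel, decide on access.

    observe(k) is a hypothetical callback returning the analog sample
    received on channel k. Access is allowed only if the posterior
    occupancy probability is at most zeta (simplified constraint).
    """
    predicted = [predict(p, p01, p11) for p in beliefs]
    k = greedy_select(predicted)
    posterior = update(predicted[k], observe(k), mu_occ, sigma)
    new_beliefs = list(predicted)
    new_beliefs[k] = posterior
    return k, posterior <= zeta, new_beliefs
```

A slot would be run as `k, access, beliefs = step(beliefs, observe, p01, p11, mu_occ, sigma, zeta)`; channels that are not sensed in a slot keep their predicted beliefs, which is what lets the tracker use the Markov structure between observations.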