Riak is a distributed, decentralized data storage system, and basho/riak-ruby-client is the Riak client for Ruby. This guide walks through key-value operations with that client.
At the end of this guide, you should be familiar with:
- Working with buckets and bucket properties
- Listing buckets and keys
- Fetching, storing, and deleting values
- Using value metadata
This and the other guides in this wiki assume you have Riak installed locally. If you don't have Riak already, please read and follow how to install Riak and then come back to this guide. If you haven't yet installed the client library, please do so before starting this guide.
This guide also assumes you know how to connect the client to Riak. All examples assume a local variable client, which is an instance of Riak::Client and points at some Riak node you want to work with.

Key-Value
Riak is a near-pure key-value store, which means that you have no tables, collections, or databases. Each key stands on its own, independent of all the others. Keys are namespaced in Buckets, which also encapsulate properties that are common among all keys in the namespace (for example, the replication factor n_val). If you conceived of Riak as a big distributed Hash, it might look like this:
What's missing from that picture:
- Each pair is given a specific spot within a cluster of Riak nodes, so any node in the cluster can find it without having to ask. This is also known as 'consistent hashing'.
- There are 3 copies of the pair by default.
- The value has metadata that you can manipulate as well as the raw value.
- A key may have multiple values in the case of race conditions and network errors. (Don't worry about this for now, but do read Resolving Conflicts when you're ready.)
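To make the 'specific spot in the cluster' idea concrete, here is a toy consistent-hashing sketch in plain Ruby. It is a deliberate simplification: the SHA-1 hash, the ring size constant, and the round-robin partition-to-node assignment are all assumptions for illustration, not Riak's actual ring logic.

```ruby
require 'digest/sha1'

RING_SIZE = 64  # number of partitions (Riak's default ring size)
N_VAL     = 3   # copies of each pair (the default replication factor)

# Hash the bucket/key pair to a partition on the ring. Riak uses its
# own hash function; SHA-1 is a stand-in for illustration.
def partition_for(bucket, key)
  Digest::SHA1.hexdigest("#{bucket}/#{key}").to_i(16) % RING_SIZE
end

# Walk clockwise from the key's partition, assigning partitions to
# nodes round-robin, to pick the N_VAL replicas (a simplification).
def preference_list(bucket, key, nodes)
  start = partition_for(bucket, key)
  (0...N_VAL).map { |i| nodes[(start + i) % nodes.size] }
end

nodes = %w[riak1 riak2 riak3 riak4]
puts preference_list('guides', 'hello', nodes).inspect
```

Because the hash is deterministic, any node can compute the same answer for the same bucket/key pair without asking anyone else.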
Enough with the exposition, let's look at some data!
Buckets
Since our keys are all grouped into buckets, in the Ruby client we get a Riak::Bucket object before doing any key-value operations. Here's how to get one:
This gives a Riak::Bucket object with the name 'guides' that is linked to the Riak::Client instance.
'But wait', you say, 'doesn't that bucket need to exist in Riak already? How do we know which bucket to request?' Buckets are virtual namespaces, as we mentioned above; they have no schema, no manifest other than any properties you set on them (see below), and so you don't need to explicitly create them. In fact, the above code doesn't even talk to Riak! Generally, we pick bucket names that have meaning to our application, or they are chosen for us automatically by another framework like Ripple or Risky.
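To illustrate that getting a bucket handle is purely local, here is a stand-in sketch in plain Ruby. The Client and Bucket classes below are toys, not riak-client source; with the real gem the call is simply bucket = client.bucket('guides'), per its documented API.

```ruby
# Toy stand-ins for the client and bucket handle (NOT riak-client
# source). With the real gem this is: bucket = client.bucket('guides')
class Client
  def bucket(name)
    Bucket.new(self, name)
  end
end

class Bucket
  attr_reader :client, :name

  def initialize(client, name)
    @client = client  # operations are delegated to the client later
    @name   = name    # a virtual namespace; nothing is created in Riak
  end
end

client = Client.new
bucket = client.bucket('guides')
puts bucket.name   # => "guides"
```

Constructing the handle records a name and a client reference and performs no network I/O, which is why the bucket doesn't need to exist yet.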
Listing buckets
If you are just starting out and don't know which buckets have data in them, you can use the 'list buckets' feature. Note that this will give you a warning about its usage, with a backtrace. You shouldn't run this operation in your application.
Looks like we don't have any buckets stored. Why? We haven't stored any data yet! Riak gets the list of buckets by examining all the keys for unique bucket names.
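The scan just described can be sketched in a few lines of plain Ruby; this is an illustration of why list-buckets is costly, not Riak's implementation.

```ruby
# Fold over every [bucket, key] pair in the keyspace and collect the
# unique bucket names -- the whole keyspace must be examined.
keys = [
  %w[guides hello],
  %w[guides goodbye],
  %w[photos cat.png]
]
buckets = keys.map { |(bucket, _key)| bucket }.uniq
puts buckets.inspect   # => ["guides", "photos"]
```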
Listing keys
You can also list the keys that are in the bucket to know what's there. Again, this is another operation that is for experimentation only and has horrible performance in production. Don't do it.
You can 'stream' keys through a block (where the block will be passed an Array of keys as the server sends them in chunks), which is slightly more efficient for large key lists, but we'll skip that for now. Check out the API docs for more information.
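The chunked delivery can be sketched in plain Ruby. The chunk Arrays below stand in for what the server streams back; the bucket.keys block form in the comment follows the client's documented API and is an assumption here.

```ruby
# Each server chunk arrives as an Array and is passed to the block;
# with riak-client: bucket.keys { |chunk| all_keys.concat(chunk) }
chunks = [%w[a b], %w[c], %w[d e]]

all_keys = []
chunks.each { |chunk| all_keys.concat(chunk) }  # one block call per chunk
puts all_keys.inspect   # => ["a", "b", "c", "d", "e"]
```

Streaming avoids buffering the whole key list in memory at once, which is why it is gentler for large buckets.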
Bucket properties
Earlier we alluded to bucket properties. If you want to grab the properties from a bucket, call the props method (which is also aliased to properties). There are a lot of things in this Hash that we don't need to care about. The most commonly-used properties are detailed on the Riak wiki. Let's set the replication factor, n_val.
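A sketch of the partial-update semantics, modeled with a plain Hash merge; the bucket.props assignment in the comment follows riak-client's documented interface and should be read as an assumption here.

```ruby
# Model the server-side merge of a partial property update:
current = { 'n_val' => 3, 'allow_mult' => false }  # existing props
update  = { 'n_val' => 5 }   # bucket.props = { 'n_val' => 5 }
merged  = current.merge(update)
puts merged.inspect   # allow_mult is untouched
```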
A number of the most common properties are exposed directly on the Bucket object, as shown above. Note that you can pass an incomplete Hash of properties, and only the properties that are part of the Hash will be changed.
The other bucket property we might care about is allow_mult, which allows your application to detect and resolve conflicting writes. It is also exposed directly:

Fetching and storing values
Now let's fetch a key from our bucket:
Depending on which protocol and backend you chose when connecting, you'll get an exception:
This means that the object does not exist. (If you rescue the Riak::FailedRequest exception, its not_found? method will return true, telling you that the error represents a missing key.) If you want to avoid the error for a key you're not sure about, you can check for its existence explicitly:
If you don't care whether the key exists yet or not, but want to start working with the value so you can store it, use get_or_new:
This gives us a new Riak::RObject (http://rdoc.info/gems/riak-client/Riak/RObject) to work with, which is a container for the value. In Riak's terminology, the combination of bucket, key, metadata, and value is called an 'object' -- please do not confuse this with Ruby's concept of an Object. All 'Riak objects' are wrapped by the Riak::RObject class. Since this is a new object, the client assumes we want to store Ruby data as JSON in Riak, and so sets the content-type for us to 'application/json', which we can see in the inspect output. The default value of the object is nil. Let's set the data to something useful:
Now we can persist that object to Riak using the store method.
If we list the keys again, we can see that the key is now part of the bucket (this time we use the bucket accessor on the object instead of going from the client object):
Now let's fetch our object again:
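The whole fetch/store cycle can be modeled in memory with a toy bucket. The comments show the corresponding riak-client calls described above; the ToyBucket class itself is only an illustration, not the library.

```ruby
# A toy in-memory bucket modeling the calls described above; the real
# riak-client calls appear in the comments.
RObject = Struct.new(:key, :content_type, :data)

class ToyBucket
  def initialize
    @store = {}
  end

  # obj = bucket.get_or_new('hello') -- fetch, or a fresh JSON container
  def get_or_new(key)
    @store.fetch(key) { RObject.new(key, 'application/json', nil) }
  end

  # obj.store -- persist the object under its key
  def store(obj)
    @store[obj.key] = obj
  end

  # bucket.keys -- listing now shows the stored key
  def keys
    @store.keys
  end
end

bucket = ToyBucket.new
obj = bucket.get_or_new('hello')    # new object, data defaults to nil
obj.data = { 'hello' => 'world' }   # set the data to something useful
bucket.store(obj)
puts bucket.keys.inspect                       # => ["hello"]
puts bucket.get_or_new('hello').data.inspect   # => {"hello"=>"world"}
```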
Assuming we're done with the object, we can delete it:
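The freezing behavior noted below can be demonstrated with plain Ruby freezing, which mirrors what the note describes (a sketch only; no Riak involved).

```ruby
# After delete, the RObject is frozen; a frozen Hash shows the
# resulting behavior.
robject_data = { 'hello' => 'world' }
robject_data.freeze   # what deletion does to the object

begin
  robject_data['hello'] = 'changed'
rescue FrozenError => e
  puts "modification refused: #{e.class}"
end
```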
Note: Deleting an RObject will freeze the object, making modifications to it impossible.

Working with metadata
We mentioned before that every value in Riak also has metadata, and the Riak::RObject lets you manipulate it. The only one we've really seen so far is the content type metadata, so let's examine that more closely.

Content type
For the sake of interoperability and ease of working with your data, Riak requires every value to have a content-type. Let's look at our previous object's content type:
Under the covers, the Ruby client will automatically convert that to and from JSON when storing and retrieving the value. If we wanted to serialize our Ruby data as a different type, we can just change the content-type:
Now our object will be serialized to YAML. The Ruby client automatically supports JSON, YAML, Marshal, and plain-text serialization. (If you want to add your own, check out the Serializers guide.)
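The serializer switch boils down to encoding the same Hash with a different library. A minimal stdlib-only sketch follows; the content-type values in the comments are assumptions about the client's mapping.

```ruby
require 'json'
require 'yaml'

data = { 'language' => 'Ruby', 'stars' => 5 }

# One Hash, two wire formats -- the content type picks the serializer.
json = JSON.generate(data)  # stored when content_type is 'application/json'
yaml = YAML.dump(data)      # stored for a YAML content type (assumed mapping)

# Both representations round-trip to the original Hash.
puts JSON.parse(json) == data       # => true
puts YAML.safe_load(yaml) == data   # => true
```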
But what if the data we want to store is not a Ruby data type, but some binary chunk of information that comes from another system? Not to worry: you can bypass serializers altogether using the raw_data accessors. Let's say I want to store a PNG image that I have on my desktop. I could do it like so:
When the client doesn't know how to deserialize the content type, it will simply display the byte size on inspection. Now here's a fun part: since I just stored an image, I can open it with my browser:
User metadata
You can also specify a bunch of free-form metadata on an RObject using the meta accessor, which is simply a Hash. For example, if we wanted to credit the PNG image we stored above to a specific person, we could add that, and it would not affect the value of the object:
Now the next time we fetch the object, we'll get back that metadata too:
The values come back as Arrays because HTTP allows multiple values per header, and user metadata is sent as HTTP headers.
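Why Arrays? A sketch of the header round trip; the X-Riak-Meta-* prefix is how the HTTP interface conventionally carries user metadata, and the person's name is a placeholder.

```ruby
# User metadata rides in X-Riak-Meta-* HTTP headers, and HTTP allows a
# header to repeat, so values conservatively round-trip as Arrays.
meta = { 'credit' => 'some_person' }   # obj.meta['credit'] = 'some_person'

headers   = meta.map { |k, v| ["X-Riak-Meta-#{k}", Array(v)] }.to_h
roundtrip = headers.map { |k, v| [k.sub('X-Riak-Meta-', ''), v] }.to_h
puts roundtrip.inspect   # => {"credit"=>["some_person"]}
```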
Vector clock
The Vector clock is Riak's means of internal accounting; that is, tracking different versions of your data and automatically updating them where appropriate. You don't usually need to worry about the vector clock, but it is accessible on the RObject as well:
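Here is a toy vector clock in plain Ruby to show what 'tracking different versions' means. Riak's actual vclock is an opaque encoded value, so this is purely conceptual.

```ruby
# Toy vector clock: each actor (node) bumps its own counter. A clock
# supersedes another when every counter is at least as large.
def descends?(a, b)
  b.all? { |actor, count| a.fetch(actor, 0) >= count }
end

v1 = { 'node1' => 1 }                 # first write, seen by node1
v2 = { 'node1' => 2, 'node2' => 1 }   # later update through both nodes

puts descends?(v2, v1)   # => true: v2 supersedes v1, no conflict
puts descends?(v1, v2)   # => false: v1 is an ancestor, not a successor
```

When neither clock descends from the other, Riak has detected concurrent writes, which is where Resolving Conflicts comes in.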
That vector clock will automatically be threaded through any operations you perform directly on the RObject (like store, delete, and reload) so that you don't have to worry about it.

Last-Modified time and ETag
Especially if you're using the HTTP interface, the last_modified and etag are useful. When reloading your object, they will be used to prevent full fetches when the object hasn't changed in Riak. They can also be used as a form of optimistic concurrency control (with very weak guarantees, mind you) by setting the prevent_stale_writes flag:
Riak prevented the stale write by sending a 412 Precondition Failed response over HTTP.

Secondary Indexes and Links
You can also access the Secondary Indexes and Links directly from the RObject, but we won't cover those here.

What to do next
Congratulations, you finished the 'Key-Value Operations' guide! After this guide, you can go beyond into more advanced querying methods, or take advantage of extended features of the client. Secondary Indexes are a very popular feature, and as all good Rubyists have thorough test suites, the Test Server is also a good next step.
Riak Data Migrator

Tool for migrating data from one or more buckets in a Riak K/V store into another Riak cluster, by exporting all data from buckets to disk and then allowing the user to load one or more of the dumped buckets back into another Riak K/V host or cluster.
Intended Usage
This data migrator tool may be helpful in the following scenarios:
- Migrating an entire non-live Riak cluster (no new writes are coming in) to a cluster with a different ring size.
- Exporting the contents of a single bucket to disk (again, on a non-live cluster where no new writes are coming in)
- Exporting a list of keys only, from a particular bucket to a text file (same conditions as above)
- Taking an approximate snapshot of live data for a QA cluster (approximate because, if new objects are written to the cluster after the export operation starts, they are not guaranteed to be exported in that session)
The app works by performing a streaming List Keys operation on one or more Riak buckets, then issuing GETs for the resulting keys (parallelized as much as possible) and storing the Riak objects (in Protocol Buffers format) on disk. The key listing alone involves multiple iterations through the entire Riak keyspace, and is not intended for frequent usage on a live cluster. On the import side, the app reads the exported Riak objects from files on disk and issues PUTs to the target cluster.
Do NOT use this for regular data backup on a live cluster. Instead, see the Backing Up Riak documentation for recommended best practices. Again, the reason for this admonition: if new data is written to a live cluster after the export operation is started, that new data is not guaranteed to be included in the export.
Workflow
To transfer data from one Riak cluster to another:
- Before using the migrator tool, make sure that the app.config files are the same in both clusters. That is, settings such as default quorum values, multi-backend configuration, and MDC replication settings should be the same before starting export/import operations.
- (Optional) If applicable, transfer custom bucket properties using the -d -t options to export from one cluster and -l -t to import into the target cluster. Note: do this only if you set custom properties such as non-default quorum values or pre- or post-commit hooks, or set up Search indexing on a bucket.
- Export the contents of a bucket (Riak objects) using the -d option to files on disk (the objects will be stored in the binary Protocol Buffers format).
- Load the Riak objects from the exported files into the target cluster using the -l option.
Downloading:
You can download the ready-to-run jar file at: http://ps-tools.data.riakcs.net:8080/riak-data-migrator-0.2.4-bin.tar.gz
After downloading, unzip/untar it, and it's ready to run from its directory.
Building from source:
- Make sure Apache Maven is installed
- Build the Riak Data Migrator itself using Maven. First, fork this project and git clone your fork.
Usage:
java -jar riak-data-migrator-0.2.4.jar [options]
Options:
Examples:
Dump (the contents of) all buckets from Riak:
java -jar riak-data-migrator-0.2.4.jar -d -r /var/riak_export -a -h 127.0.0.1 -p 8087 -H 8098
Load all buckets previously dumped back into Riak:
java -jar riak-data-migrator-0.2.4.jar -l -r /var/riak-export -a -h 127.0.0.1 -p 8087 -H 8098
Dump (the contents of) buckets listed in a line delimited file from a Riak cluster:
Export only the bucket settings from a bucket named 'Flights':
java -jar riak-data-migrator-0.2.4.jar -d -t -r /var/riak-export -b Flights -h 127.0.0.1 -p 8087 -H 8098
Load bucket settings for a bucket named 'Flights':
java -jar riak-data-migrator-0.2.4.jar -l -t -r /var/riak-export -b Flights -h 127.0.0.1 -p 8087 -H 8098
Copy all buckets from one Riak host to another:
java -jar riak-data-migrator-0.2.4.jar -copy -r /var/riak_export -a -h 127.0.0.1 -p 8087 --copyhost 192.168.1.100 --copypbport 8087
Caveats:
- This app depends on the key listing operation in the Riak client, which is slow on a good day.
- The Riak memory backend bucket listing operation tends to time out if any significant amount of data exists. In this case, you have to explicitly specify the buckets you want to dump using the -f option, which takes a line-delimited list of buckets in a file.
Version Notes:
0.2.4
- Verbose status output is now the default
- Added option to turn off verbose output
- Logging of final status

0.2.3
- Changed internal message passing between threads from Riak objects to events for Dump, Load, and Copy operations (but not Delete)
- Added the capability to transfer data directly between clusters
- Added the capability to copy a single bucket into a new bucket for the Load or Copy operations
- Changed log level for retry attempts (but not max-retries-reached) to warn vs. error

0.2.2
- Changed message passing for Dump partially to events
- Added logic to count the number of value-not-founds (i.e. 404s) when reading
- Added summary output for value-not-founds
< 0.2.1 Ancient History